Vol. XXXIV. Parts III and IV December, 1947 


BIOMETRIKA 


A JOURNAL FOR THE STATISTICAL STUDY OF 
BIOLOGICAL PROBLEMS 


FOUNDED BY
W. F. R. WELDON, FRANCIS GALTON AND KARL PEARSON


EDITED BY
EGON S. PEARSON 


IN CONSULTATION WITH 
HARALD CRAMER J. B. S. HALDANE 
R. C. GEARY G. M. MORANT 
MAJOR GREENWOOD JOHN WISHART 


ISSUED BY THE BIOMETRIKA OFFICE 
UNIVERSITY COLLEGE, LONDON 


AND PRINTED AT THE 
UNIVERSITY PRESS, CAMBRIDGE


Reprinted by offset-litho 1964 
[Issued 30 December, 1947] 






















ON THE DISTRIBUTION OF THE RANK CORRELATION
COEFFICIENT τ WHEN THE VARIATES ARE
NOT INDEPENDENT


By WASSILY HÖFFDING


I. INTRODUCTION


1. Consider a population distributed according to two variates $x$, $y$. Two members
$(x_1, y_1)$ and $(x_2, y_2)$ of the population will be called concordant if both values of one member
are greater than the corresponding values of the other one, that is if

    $x_1 < x_2,\ y_1 < y_2$  or  $x_1 > x_2,\ y_1 > y_2$.

They will be called discordant if for one member one value is greater and the other one smaller
than for the other member, that is if

    $x_1 < x_2,\ y_1 > y_2$  or  $x_1 > x_2,\ y_1 < y_2$.

The probability $p$ that two members drawn from the population at random without
replacement are concordant will be called the probability of concordance, the probability
$q$ that they are discordant will be called the probability of discordance.

In the following only populations will be considered for which the probabilities of $x_1 = x_2$
or $y_1 = y_2$ are zero, so that

    $p + q = 1$.    (1)


The main types of such populations are (a) an infinite population with both $x$ and $y$
distributed continuously, (b) a finite population where all values of $x$ and all values of $y$ are
different among themselves. The condition that the two members are drawn without
replacement is, of course, only relevant in case (b).

For a sample of $n$ members drawn from the population, the probabilities of concordance
and discordance are defined in the same manner as for the population. They will be denoted
by $p'$ and $q'$ to distinguish them from the population values. If for the population (1) is
fulfilled, it may be assumed that all values of $x$ and all values of $y$ in the sample are different,
so that

    $p' + q' = 1$.    (2)

It follows from the definition that $p'$ is the relative frequency of concordant pairs among
the $\binom{n}{2}$ pairs which can be formed from the members of the sample.


The probability of concordance expresses an essential property of a bivariate distribution.
It may in itself be considered as a measure of correlation. $p'$ is an estimate of $p$; it will be
shown that the mean value of $p'$ is $p$. If a coefficient lying between the limits $-1$ and $+1$ is
preferred, the quantity

    $\tau = p' - q' = 2p' - 1$    (3)

may be taken.
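The definitions of $p'$, $q'$ and $\tau$ can be illustrated by a short computation. The following is only a sketch: the function name `concordance` and the four-member sample are hypothetical and do not come from the paper, and the data are assumed to contain no ties.

```python
from fractions import Fraction
from itertools import combinations

def concordance(xs, ys):
    """Return (p', q', tau) for paired values with no ties in xs or ys."""
    pairs = list(combinations(range(len(xs)), 2))
    concordant = sum(
        1 for i, j in pairs if (xs[i] - xs[j]) * (ys[i] - ys[j]) > 0
    )
    p = Fraction(concordant, len(pairs))  # relative frequency of concordant pairs
    q = 1 - p                             # equation (2): p' + q' = 1
    return p, q, p - q                    # equation (3): tau = p' - q' = 2p' - 1

# Hypothetical sample of four members (illustration only).
p, q, tau = concordance([1, 2, 3, 4], [1, 3, 2, 4])
print(p, q, tau)  # 5/6 1/6 2/3
```

Exact fractions are used so that the identity $\tau = 2p' - 1$ can be checked without rounding.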

2. The quantity $p$ here termed the probability of concordance was apparently first
considered by Esscher (1924) who also used the quantity

    $D = \binom{n}{2}^{-1} \sum_{j=1}^{n-1} \sum_{i=j+1}^{n} \operatorname{sign}(x_i - x_j)\,\operatorname{sign}(y_i - y_j)$,

(where $x_i$, $y_i$, $i = 1, \dots, n$, are the sample values of the variates) which is the same as the
coefficient $\tau$ as defined by (3). Esscher showed that if $x$ and $y$ are normally correlated with
correlation coefficient $r$, the expectation of $D = \tau$ is

    $E(\tau) = \frac{2}{\pi} \sin^{-1} r$.    (4)


Hence, from this equation, he suggested estimating $r$ from ranked data by means of the relation

    $\hat r = \sin \frac{\pi}{2}\tau = \sin \pi (p' - \tfrac12)$.

For the variance of $\tau$ Esscher found in the case of a normally distributed population

    $\binom{n}{2} \operatorname{var}(\tau) = 4pq + 2(n-2)\left\{\frac19 - \left(\frac{2}{\pi}\sin^{-1}\frac{r}{2}\right)^2\right\}$,    (5)

where $4pq = 1 - \left(\frac{2}{\pi}\sin^{-1} r\right)^2$.
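Esscher's relations for a normal population can be put into a few lines of code. This is a sketch only: the function names are hypothetical, and the variance is coded in the form $\binom{n}{2}\operatorname{var}(\tau) = 4pq + 2(n-2)\{\tfrac19 - (\tfrac{2}{\pi}\sin^{-1}\tfrac{r}{2})^2\}$ with $4pq = 1 - (\tfrac{2}{\pi}\sin^{-1} r)^2$, which is assumed here to be the content of (5).

```python
import math

def expected_tau(r):
    """Equation (4): E(tau) = (2/pi) * arcsin(r) for a normal population."""
    return 2 / math.pi * math.asin(r)

def esscher_r(tau):
    """Esscher's estimate of r from tau: sin(pi * tau / 2)."""
    return math.sin(math.pi * tau / 2)

def var_tau_normal(r, n):
    """Variance of tau under normality, in the assumed form of equation (5)."""
    four_pq = 1 - (2 / math.pi * math.asin(r)) ** 2
    bracket = 1 / 9 - (2 / math.pi * math.asin(r / 2)) ** 2
    return (four_pq + 2 * (n - 2) * bracket) / (n * (n - 1) / 2)

# The estimate inverts equation (4).
print(esscher_r(expected_tau(0.5)))
# For r = 0 the variance reduces to 2(2n + 5) / (9 n (n - 1)).
print(var_tau_normal(0.0, 10), 2 * 25 / (9 * 10 * 9))
```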


While Esscher saw in $p'$ and $D = \tau$ only a means for estimating $r$, Lindeberg (1926) stressed
the significance of the probability of concordance itself for judging the degree of dependence
between the variates. He proposed for that purpose the coefficient

    $P = 100p' - 50 = 50\tau$,

called by him Korrelationsprozent. Lindeberg also gave, without proof, a formula for the
variance of $p'$ in the general case of correlated variates (see (13) below).

Jordan (1927) suggested using, instead of Lindeberg's $P$, the coefficient later termed by
Kendall $\tau$.

Kendall (1938), independently of the above authors, proposed $\tau$ as a measure of rank
correlation. He completely solved the problem of the sampling distribution of $\tau$ in a universe
in which all possible rankings are equally probable, showing that it rapidly tends to
normality for increasing $n$.


3. The main object of this paper is to show that the sampling distribution of $p'$ (and hence
that of $\tau$) tends to normality as $n \to \infty$ for any population with continuously distributed $x$
and $y$ if a certain condition is fulfilled (Part IV). In addition, Lindeberg's formula for
the variance of $p'$ is proved (Part II) and extended for a finite population (Part V). Finally,
in Part VI the problem of estimating var $(p')$ from the sample is considered.


II. MEAN VALUE AND VARIANCE OF p’ IN THE CASE OF AN INFINITE POPULATION


4. Consider a sample of $n$ drawn at random from an infinite population with continuous
$x$ and $y$. Replace the values of $x$ and of $y$ in the sample by their ranks and arrange the
members of the sample so that the ranking of $x$ is $1, 2, \dots, n$. Then the ranking of $y$ is a
permutation

    $\Pi = (\pi_1, \dots, \pi_n)$

of the numbers $1, \dots, n$.

Let $I$ and $J$ be the numbers of inversions in the permutations $(\pi_n, \pi_{n-1}, \dots, \pi_1)$ and
$(\pi_1, \dots, \pi_n)$. Then

    $p' = \frac{2I}{n(n-1)}, \quad q' = \frac{2J}{n(n-1)}$.    (6)

Thus the knowledge of the permutation $\Pi$ corresponding to the given sample is sufficient
for evaluating $p'$.
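The inversion counts of para. 4 can be computed directly. The sketch below assumes the reading $p' = 2I/n(n-1)$, $q' = 2J/n(n-1)$ of equation (6), where $J$ counts the inversions of the permutation and $I$ those of the reversed permutation; the function names and the example permutation are hypothetical.

```python
from fractions import Fraction

def inversions(seq):
    """Number of pairs (i, j) with i < j and seq[i] > seq[j]."""
    n = len(seq)
    return sum(1 for i in range(n) for j in range(i + 1, n) if seq[i] > seq[j])

def p_q_from_permutation(perm):
    """Return (p', q') from the y-ranks listed in increasing order of x."""
    n = len(perm)
    J = inversions(perm)        # discordant pairs
    I = inversions(perm[::-1])  # concordant pairs
    return Fraction(2 * I, n * (n - 1)), Fraction(2 * J, n * (n - 1))

print(p_q_from_permutation((1, 3, 2, 4)))  # (Fraction(5, 6), Fraction(1, 6))
```

A quadratic-time count is used for clarity; for long rankings a merge-sort based inversion count would be the usual substitute.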
















5. Let $P(\Pi)$ be the probability of drawing a random sample represented by the
permutation $\Pi$. Let $p'(\Pi)$ be the probability of concordance for such a sample. Then

    $p = \sum P(\Pi)\, p'(\Pi)$,    (7)

where the sum is extended over all permutations $\Pi$ of $n$ numbers.
The right-hand side of (7) is equal to the mean value of $p'$. Hence

    $E p' = p$.    (8)


Consider, in generalization of $p'$, the probability $w'$ that among $m \le n$ members drawn
from the sample at random without replacement, certain pairs of members are concordant:
for instance, among four members A, B, C, D, the pairs AB, AC, AD; or the pairs AB, CD,
etc. Let $w$ be the corresponding probability for the parent population. Then it is seen in
the same manner as with $p'$ that

    $E w' = w$.    (9)

Thus, if we can express $(p')^\mu$, the probability of drawing $\mu$ concordant pairs from the
sample, replacing each pair after drawing it, by probabilities without replacement of the
type $w'$, we can also, in virtue of (9), represent $E(p')^\mu$ by population parameters of the type $w$.


6. Now, $(p')^2$, the probability of drawing from the sample one concordant pair and, after
replacing it, of drawing again a concordant pair, is the sum of the following three probabilities:

(a) the probability of getting the same pair in both drawings $\left(1 / \binom{n}{2}\right)$, multiplied by
the probability that this pair is concordant ($p'$);

(b) the probability that the second pair has one member in common with the first pair
$\left(2(n-2) / \binom{n}{2}\right)$, multiplied by the probability, say $k'$, that among three members A, B, C
drawn from the sample without replacement, one, say A, is concordant with the other two;

(c) the probability that the second pair has no member in common with the first one
$\left(\binom{n-2}{2} / \binom{n}{2}\right)$, multiplied by the probability that among four members A, B, C, D drawn
without replacement, two pairs without a member in common, say AB and CD, are
concordant. The latter probability may be denoted by $(p^2)'$ since the corresponding probability
for the infinite population is $p^2$.

Thus,

    $\binom{n}{2} (p')^2 = p' + 2(n-2)k' + \binom{n-2}{2} (p^2)'$,    (10)

and, applying (9),

    $\binom{n}{2} E(p')^2 = p + 2(n-2)k + \binom{n-2}{2} p^2$.    (11)

Hence, we have for the variance of $p'$

    $\binom{n}{2} \operatorname{var}(p') = \binom{n}{2}\{E(p')^2 - p^2\} = p + 2(n-2)k - (2n-3)p^2$    (12)

or

    $\binom{n}{2} \operatorname{var}(p') = \tfrac14 \binom{n}{2} \operatorname{var}(\tau) = p(1-p) + 2(n-2)(k - p^2)$.    (13)


This is identical with the formula given without proof by Lindeberg (1926). 


7. In the case considered by Kendall where all permutations $\Pi$ of $n$ numbers are equally
probable, the permutations of $m \le n$ also are equiprobable. Hence

    $p = P(1,2) = q = P(2,1) = \tfrac12$.










Further, representing $k$ as the mean value of $k'$ in a sample of 3, we find

    $k = P(123) + \tfrac13 P(132) + \tfrac13 P(213) = (1 + \tfrac13 + \tfrac13)\tfrac16 = \tfrac{5}{18}$.

Inserting these values in (13), we have

    $\operatorname{var}(p') = \frac{2n+5}{18n(n-1)}$,

    $\operatorname{var}(\tau) = 4\operatorname{var}(p') = \frac{2(2n+5)}{9n(n-1)}$,

in accordance with Kendall's formula.
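The null-case variance can be confirmed for small $n$ by enumerating every permutation, each taken as equally probable. This is an illustrative check only; the function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations, permutations

def p_prime(perm):
    """Relative frequency of concordant pairs for y-ranks listed in x-order."""
    n = len(perm)
    concordant = sum(1 for i, j in combinations(range(n), 2) if perm[i] < perm[j])
    return Fraction(concordant, n * (n - 1) // 2)

def null_variance(n):
    """Exact variance of p' when all n! permutations are equally probable."""
    values = [p_prime(perm) for perm in permutations(range(n))]
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

for n in range(2, 7):
    # Compare with var(p') = (2n + 5) / (18 n (n - 1)).
    print(n, null_variance(n), Fraction(2 * n + 5, 18 * n * (n - 1)))
```

The enumeration grows as $n!$, so it is practical only for very small samples; it serves as a check on the formula, not as a method of computation.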


III. SOME ALGEBRAIC FORMULAE


8. We shall now consider some algebraic relations to be used in the proof of normality 
of p’ for large n. 
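Let $f_d(\rho)$ be a polynomial of degree $d$ in $\rho$. Then

    $\sum_{\rho=0}^{\beta} (-1)^{\beta-\rho} \binom{\beta}{\rho} f_d(\rho) = 0$ if $d < \beta$, and $= \beta!\,a_\beta$ if $d = \beta$,    (14)

where $a_\beta$ is the coefficient of the highest power $\rho^\beta$ in $f_\beta(\rho)$.
To prove (14) write

    $f_d(\rho) = a_d\,\rho^{[d]} + a'_{d-1}\,\rho^{[d-1]} + \dots$,

where $\rho^{[0]} = 1$, $\rho^{[\delta]} = \rho(\rho-1)\dots(\rho-\delta+1)$, $(\delta \ge 1)$.
Then (14) follows from the fact that

    $\sum_{\rho=0}^{\beta} (-1)^{\beta-\rho} \binom{\beta}{\rho} \rho^{[\delta]}$

is equal to $\beta^{[\delta]}(1-1)^{\beta-\delta} = 0$ if $\beta - \delta > 0$ and to $\beta!$ if $\beta = \delta$.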
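9. For any non-negative integer $\nu$ we may write

    $n^\nu = d^{(\alpha)}_{\nu,0}(n-\alpha)^{[\nu]} + d^{(\alpha)}_{\nu,1}(n-\alpha)^{[\nu-1]} + \dots + d^{(\alpha)}_{\nu,\nu-1}(n-\alpha) + d^{(\alpha)}_{\nu,\nu}$.    (15)

We will study certain properties of the coefficients $d^{(\alpha)}_{\nu,\kappa}$.
From (15) it is seen immediately that

    $d^{(\alpha)}_{\nu,0} = 1$.    (16)

Inserting in (15) $n = \alpha + \beta$ $(\beta = 0, 1, \dots)$ we have

    $(\alpha+\beta)^\nu = d^{(\alpha)}_{\nu,\nu-\beta}\,\beta! + d^{(\alpha)}_{\nu,\nu-\beta+1}\,\beta^{[\beta-1]} + \dots + d^{(\alpha)}_{\nu,\nu-1}\,\beta + d^{(\alpha)}_{\nu,\nu}$    (17)

or

    $d^{(\alpha)}_{\nu,\nu} = \alpha^\nu, \quad d^{(\alpha)}_{\nu,\nu-\beta} = \frac{1}{\beta!}\left\{(\alpha+\beta)^\nu - \sum_{\lambda=0}^{\beta-1} \beta^{[\lambda]}\, d^{(\alpha)}_{\nu,\nu-\lambda}\right\}, \quad (\beta = 1, 2, \dots)$.    (18)

Hence we find by induction

    $d^{(\alpha)}_{\nu,\nu-\beta} = \frac{1}{\beta!} \sum_{\rho=0}^{\beta} (-1)^{\beta-\rho} \binom{\beta}{\rho} (\alpha+\rho)^\nu$.    (19)

If we take this as definition of $d^{(\alpha)}_{\nu,\kappa}$ for $\kappa < 0$, we have in virtue of (14)

    $d^{(\alpha)}_{\nu,\kappa} = 0$ for $\kappa < 0$.    (20)

Expanding $(\alpha+\rho)^\nu$ we have from (19)

    $d^{(\alpha)}_{\nu,\nu-\beta} = \sum_{\sigma=0}^{\nu} \binom{\nu}{\sigma} \alpha^\sigma\, \frac{1}{\beta!} \sum_{\rho=0}^{\beta} (-1)^{\beta-\rho} \binom{\beta}{\rho} \rho^{\nu-\sigma}$.

Comparing the last sum with (19) and writing

    $d^{(0)}_{\nu,\kappa} = d_{\nu,\kappa}$,

we have

    $d^{(\alpha)}_{\nu,\nu-\beta} = \sum_{\sigma=0}^{\nu} \binom{\nu}{\sigma} d_{\nu-\sigma,\,\nu-\beta-\sigma}\, \alpha^\sigma$

or, putting $\nu - \beta = \kappa$ and noting that, by (20), $d_{\nu-\sigma,\kappa-\sigma} = 0$ for $\sigma > \kappa$,

    $d^{(\alpha)}_{\nu,\kappa} = \sum_{\sigma=0}^{\kappa} \binom{\nu}{\sigma} d_{\nu-\sigma,\kappa-\sigma}\, \alpha^\sigma$.    (21)

We have the recurrence relation

    $d^{(\alpha)}_{\nu+1,\kappa} - d^{(\alpha)}_{\nu,\kappa} = (\alpha + \nu + 1 - \kappa)\, d^{(\alpha)}_{\nu,\kappa-1}$,    (22)

which can be obtained by multiplying (15) by $n$, then writing down (15) with $\nu + 1$ instead
of $\nu$, and comparing coefficients in both expressions.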
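10. We prove now two properties of the coefficients $d^{(\alpha)}_{\nu,\kappa}$.

(I) $d_{\nu,\kappa}$ is a polynomial in $\nu$ of degree $2\kappa$, the term of highest degree being $\nu^{2\kappa}/2^\kappa \kappa!$.
In virtue of (16) this is true for $\kappa = 0$. And if it is true for $\kappa - 1$, the highest term of $d_{\nu+1,\kappa} - d_{\nu,\kappa}$
is, by (22) with $\alpha = 0$, $\nu^{2\kappa-1}/2^{\kappa-1}(\kappa-1)!$, and hence that of $d_{\nu,\kappa}$, by a well-known theorem,

    $\frac{1}{2\kappa} \cdot \frac{1}{2^{\kappa-1}(\kappa-1)!}\, \nu^{2\kappa} = \frac{\nu^{2\kappa}}{2^\kappa \kappa!}$.

(II) $d^{(\gamma-t)}_{t,\kappa}$ is a polynomial in $t$ of degree $2\kappa$ with the highest term $(-1)^\kappa t^{2\kappa}/2^\kappa \kappa!$.
From (21),

    $d^{(\gamma-t)}_{t,\kappa} = \sum_{\sigma=0}^{\kappa} \binom{t}{\sigma} d_{t-\sigma,\kappa-\sigma}\, (\gamma-t)^\sigma$.

In $\binom{t}{\sigma}$ the highest term in $t$ is $t^\sigma/\sigma!$.

In $d_{t-\sigma,\kappa-\sigma}$ the highest term in $t$ is $t^{2\kappa-2\sigma}/2^{\kappa-\sigma}(\kappa-\sigma)!$ (by (I)).

In $(\gamma-t)^\sigma$ the highest term in $t$ is $(-1)^\sigma t^\sigma$.
Hence, in $d^{(\gamma-t)}_{t,\kappa}$ the highest term is

    $\sum_{\sigma=0}^{\kappa} \frac{(-2)^\sigma}{\sigma!\,(\kappa-\sigma)!} \cdot \frac{t^{2\kappa}}{2^\kappa} = \frac{(-1)^\kappa}{2^\kappa \kappa!}\, t^{2\kappa}$.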

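11. $d_{\nu,\nu-\beta}$ has also a combinatorial meaning. Let

    $\Sigma_\beta(\nu) = \sum \frac{\nu!}{\nu_1! \dots \nu_\beta!}, \quad \Sigma'_\beta(\nu) = {\sum}' \frac{\nu!}{\nu_1! \dots \nu_\beta!}$,

where $\sum$ indicates summation over all $\nu_i \ge 0$, ${\sum}'$ over all $\nu_i \ge 1$, and in both cases $\nu_1 + \dots + \nu_\beta = \nu$.
$\Sigma_\beta(\nu)$ is the number of ways of allocating $\nu$ objects on $\beta$ places, and $\Sigma'_\beta(\nu)$ is the number of
ways of allocating $\nu$ objects on $\beta$ places in such a manner that no place remains empty.

We have $\Sigma_\beta(\nu) = \beta^\nu$,
and a little consideration shows that

    $\Sigma_\beta(\nu) = \Sigma'_\beta(\nu) + \binom{\beta}{1} \Sigma'_{\beta-1}(\nu) + \dots + \binom{\beta}{\beta-1} \Sigma'_1(\nu)$.

Comparing this with (17) we see that

    $d_{\nu,\nu-\beta} = \frac{1}{\beta!}\, \Sigma'_\beta(\nu)$.    (23)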
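The coefficients of Part III lend themselves to a numerical check. The sketch below assumes the reading $d^{(\alpha)}_{\nu,\nu-\beta} = \frac{1}{\beta!}\sum_{\rho=0}^{\beta}(-1)^{\beta-\rho}\binom{\beta}{\rho}(\alpha+\rho)^\nu$ of (19), and tests it against the expansion (15) of $n^\nu$ in the falling factorials $(n-\alpha)^{[\nu-\kappa]}$, the recurrence (22), and the count of allocations with no empty place in (23). All function names are hypothetical.

```python
from fractions import Fraction
from itertools import product
from math import comb, factorial

def falling(x, k):
    """Falling factorial x^[k] = x (x - 1) ... (x - k + 1), with x^[0] = 1."""
    out = 1
    for i in range(k):
        out *= x - i
    return out

def d(alpha, nu, kappa):
    """Coefficient d^(alpha)_{nu,kappa} computed from formula (19)."""
    beta = nu - kappa
    if beta < 0:
        return Fraction(0)
    total = sum(
        (-1) ** (beta - rho) * comb(beta, rho) * (alpha + rho) ** nu
        for rho in range(beta + 1)
    )
    return Fraction(total, factorial(beta))

def expansion_holds(alpha, nu, n):
    """Check equation (15) for one choice of alpha, nu and n."""
    rhs = sum(d(alpha, nu, k) * falling(n - alpha, nu - k) for k in range(nu + 1))
    return rhs == n ** nu

def onto_allocations(nu, beta):
    """Brute-force count of ways to put nu objects on beta places, none empty."""
    return sum(1 for f in product(range(beta), repeat=nu) if len(set(f)) == beta)

print(d(0, 4, 2))  # 7, the number of ways to split 4 objects into 2 non-empty classes
```

With $\alpha = 0$ the coefficients are the Stirling numbers of the second kind, which is the combinatorial meaning stated in para. 11.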







IV. PROOF OF NORMALITY OF p’ FOR n → ∞


12. Any set of different pairs of elements belonging to the population will be briefly referred 
to as a system (two pairs being different if they have no more than one element in common). 

If we represent the elements of a system by points in a plane and the pairs of elements by 
lines joining the points, we have a pattern corresponding to the given system. Two systems 
will be said to have the same pattern if there exists a one-to-one correspondence between the 
elements of both systems such that if two elements of one system form a pair, the two corre- 
sponding elements of the other system also form a pair. Thus the only thing relevant in a 
pattern is the lines connecting the points, the position of the points having no significance. 

A pattern will be called simple if one can pass from any point of the pattern to any other 
one along lines belonging to the pattern. A composite pattern is a pattern consisting of more 
than one simple pattern. 

If the elements of a system (or the points of the corresponding pattern) are denoted 
by different letters A, B, C, ..., each pair of the system can be represented by a pair of 
letters. All systems of one pair have the same pattern (AB). There are two patterns 
of two pairs, one simple and containing three points (AB, BC) and one composite and 
containing four points (AB,CD). There are five patterns of three pairs, three simple 
(AB, BC,CA; AB, BC,CD; AB, AC, AD), one consisting of two different simple patterns 
(AB, CD, DE) and one consisting of three equal simple patterns (AB, CD, EF).
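The counts of patterns given above (one pattern of one pair, two of two pairs, five of three pairs) can be reproduced by brute force. In this sketch a pattern is treated as a set of pairs up to relabelling of its points; the function names and the canonical-form device are assumptions made for illustration and are not part of the paper's argument.

```python
from itertools import combinations, permutations

def canonical(pairs):
    """Smallest relabelled pair list; equal for systems with the same pattern."""
    points = sorted({v for pair in pairs for v in pair})
    best = None
    for image in permutations(range(len(points))):
        relabel = dict(zip(points, image))
        form = tuple(sorted(tuple(sorted((relabel[a], relabel[b]))) for a, b in pairs))
        if best is None or form < best:
            best = form
    return best

def is_simple(pairs):
    """True if any point can be reached from any other along the pairs."""
    points = {v for pair in pairs for v in pair}
    reached = {min(points)}
    grew = True
    while grew:
        grew = False
        for a, b in pairs:
            if (a in reached) != (b in reached):
                reached |= {a, b}
                grew = True
    return reached == points

def patterns(num_pairs):
    """Map each pattern of num_pairs different pairs to its simple/composite flag."""
    all_pairs = list(combinations(range(2 * num_pairs), 2))
    found = {}
    for pairs in combinations(all_pairs, num_pairs):
        found.setdefault(canonical(pairs), is_simple(pairs))
    return found

for b in (1, 2, 3):
    result = patterns(b)
    print(b, len(result), sum(result.values()))  # pairs, patterns, simple patterns
```

The search tries every relabelling of at most six points, which is adequate for three pairs but would need a proper graph-isomorphism routine for larger patterns.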

13. If a simple pattern consists of $a_j$ points and $b_j$ pairs,

    $a_j \le b_j + 1$.    (24)

For this is true for $b_j = 1$, and by adding one pair to a simple pattern, at most one point
is added if the new pattern is to be simple again.

Denote the different simple patterns by $S_1, S_2, \dots, S_j, \dots$, where $S_1$ stands for the one-pair
pattern and $S_2$ for the two-pairs pattern (AB, BC), all $S_j$ with $j \ge 3$ consisting of three or
more pairs. Let $a_j$ be the number of points and $b_j$ the number of pairs in $S_j$. Then

    $a_1 = 2,\ a_2 = 3,\ b_1 = 1,\ b_2 = 2,\ b_j \ge 3$ if $j \ge 3$.    (25)

Consider a pattern $P$ composed of $\gamma_1$ simple patterns $S_1$, $\gamma_2$ simple patterns $S_2$, etc., and
containing $a$ points and $b$ pairs. Then, writing symbolically

    $P = \sum \gamma_j S_j$,

we have

    $a = \sum \gamma_j a_j, \quad b = \sum \gamma_j b_j$.

In virtue of (24),

    $3b - 2a = \sum \gamma_j (3b_j - 2a_j) \ge \sum \gamma_j (b_j - 2)$,

and from (25)

    $3b - 2a \ge -\gamma_1$,    (26)

the sign of equality holding if, and only if, pattern $P$ contains no other simple patterns than
$S_1$ and $S_2$.


14. $(p')^\mu$ is the probability that $\mu$ pairs of elements drawn from the sample, replacing
each pair after drawing, are all concordant. We may write

    $(p')^\mu = \sum A_i w'_i$,

where $A_i$ is the probability that $\mu$ pairs are drawn from the sample in such a way that the
system of different pairs among them has the pattern $P_i$, and if $a_i$ is the number of points
in $P_i$, $w'_i$ is the probability that if $a_i$ elements are drawn from the sample without replacement
and paired according to pattern $P_i$, all pairs of $P_i$ are concordant. The summation is extended
over all patterns $P_i$ with no more than $\mu$ pairs.

Since the probabilities $w'_i$ are of the type for which formula (9) is applicable, we have

    $E(p')^\mu = \sum A_i w_i$,    (27)

where, as usual, $w_i$ is the population probability corresponding to the sample probability $w'_i$.


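15. Consider a term $Aw$ in (27) corresponding to the pattern

    $P = \sum \gamma_j S_j$

with $\gamma = \sum \gamma_j$ simple patterns, $a = \sum \gamma_j a_j$ points and $b = \sum \gamma_j b_j$ pairs.
Let

    $\bar P = \sum_{j \ge 2} \gamma_j S_j$

be the pattern obtained from $P$ by excluding the single-pair patterns $S_1$. Then

    $\bar\gamma = \gamma - \gamma_1, \quad \bar a = a - 2\gamma_1 \quad$ and $\quad \bar b = b - \gamma_1$    (28)

are the numbers of simple patterns, points and pairs in $\bar P$.
We have

    $w = p^{\gamma_1}\, \bar v$,    (29)

where $\bar v$ is independent of $p$ and $\gamma_1$, only depending on the pattern $\bar P$.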


16. The probability $A$ will be studied, in the first place, as a function of $n$ and $\gamma_1$, while
its dependence on $\bar P$ will be considered later and only in a special case. It must be borne in
mind that, by (28), $\gamma$, $a$ and $b$ also depend on $\gamma_1$.

Let $Q_1, Q_2, \dots, Q_b$ be the pairs of pattern $P$ numbered in some definite order. Suppose
pair $Q_\beta$ appears $\mu_\beta$ times $(\beta = 1, \dots, b)$. Then

    $\mu_1 + \dots + \mu_b = \mu, \quad \mu_\beta \ge 1 \quad (\beta = 1, \dots, b)$.

Let $R_1, R_2, \dots, R_\mu$ be the total set of the pairs drawn, numbered independently of the
order in which they appear. Then $\mu_1$ $R$'s are equal to $Q_1$, $\mu_2$ $R$'s are equal to $Q_2$, etc.

Let $B$ be the probability that among $\mu$ pairs drawn from the sample, replacing each pair
after drawing, $b$ pairs are different and arranged according to pattern $P$, pair $Q_\beta$ appearing
$\mu_\beta$ times $(\beta = 1, \dots, b)$ and the $\mu$ pairs being drawn in a definite order, say $R_1, R_2, \dots, R_\mu$.

Suppose $R_1$ is a $Q_1$. Since any pair drawn may be taken as $Q_1$ (only the relative position
of the pairs being relevant), the probability first to draw $R_1$ is 1. The probability that the
second pair drawn is $R_2$ depends on whether $R_2$ has no, one or both elements in common
with $R_1$. In the first case, it is $\binom{n-2}{2} / \binom{n}{2}$, in the second case $2(n-2) / \binom{n}{2}$ (the factor 2
arising from the fact that each of the two elements of $R_1$ can be the element common with
$R_2$), and in the third case, $1 / \binom{n}{2}$.

In general, if the first $\lambda$ pairs drawn are $R_1, \dots, R_\lambda$, and if they form a pattern $P'$ containing
$\alpha$ different elements, the probability that the $(\lambda+1)$th pair drawn is $R_{\lambda+1}$ depends on whether
$R_{\lambda+1}$ has no, one or both elements in common with $P'$. In the first case it is $\binom{n-\alpha}{2} / \binom{n}{2}$,
in the second case, $c'(n-\alpha) / \binom{n}{2}$, and in the third case, $c'' / \binom{n}{2}$, where $c'$ and $c''$ are
independent of $n$. If, in the last case, $R_{\lambda+1}$ is equal to one of the preceding $R$'s, $c'' = 1$.








$B$ is the product of all $\mu$ such probabilities, and it is seen from the above consideration
that it is of the form

    $B = C\,(n-2)^{[a-2]} \binom{n}{2}^{-\mu+1}$,

where $C$ is independent of $n$.

We also see that a pair which has already appeared before makes no contribution to $C$.
Hence, $C$ only depends on the different pairs of pattern $P$, and is independent of the
numbers $\mu_\beta$.

The above reflexion further shows that for any simple pattern contained in $P$, the pair
drawn first, having no elements in common with the preceding pairs, contributes to $C$ the
factor $\tfrac12$, except for the first pair, $R_1$, which yields the factor 1. Thus, $C$ contains the factor
$2^{-\gamma+1} = 2^{-\gamma_1-\bar\gamma+1}$, and $2^{-\gamma_1}$ is obviously the sole contribution to $C$ from the $\gamma_1$ single pairs
(pattern $S_1$) contained in $P$. Hence

    $B = 2^{-\gamma_1}\, C'\, (n-2)^{[2\gamma_1+\bar a-2]} \binom{n}{2}^{-\mu+1}$,

where $C'$ is independent of $n$ and $\gamma_1$, and also independent of the order in which the $\gamma_1$
single pairs are drawn.

$A$, the probability that $\mu$ pairs drawn form pattern $P$, irrespective of the order in which
they appear, depends on $n$ in the same way as $B$. As a function of $\gamma_1$, $A$ contains, besides
$2^{-\gamma_1}$, the factor $1/\gamma_1!$ owing to the fact that the $\gamma_1$ single pairs are interchangeable. Further
it contains the factor $\Sigma'_b(\mu)$ which indicates the number of ways of allocating $\mu$ objects on
$b$ places so that no place remains empty. In virtue of (23) we have

    $\Sigma'_b(\mu) = b!\; d_{\mu,\mu-b}$.

Thus, $A$ is of the form

    $A = D^{(\mu)}\, (n-2)^{[2\gamma_1+\bar a-2]} \binom{n}{2}^{-\mu+1}$,    (30)

where

    $D^{(\mu)} = 2^{-\gamma_1}\, \frac{(\gamma_1+\bar b)!}{\gamma_1!}\, D'\, d_{\mu,\,\mu-\gamma_1-\bar b}$    (31)

and $D'$ is independent of both $n$ and $\gamma_1$ and only depends on the pattern $\bar P$ containing no $S_1$.
Inserting (29) and (30) in (27), we have

    $\binom{n}{2}^{\mu-1} E(p')^\mu = \sum D^{(\mu)}\, p^{\gamma_1}\, \bar v\, (n-2)^{[2\gamma_1+\bar a-2]}$,    (32)

the summation taking place over all patterns with no more than $\mu$ pairs. (32) also holds
for $\mu = 0$ if by a 'pattern of 0 pairs' we understand the case $\gamma_1 = \gamma_2 = \dots = 0$ and take (31)
as definition of $D^{(0)}$ with suitably chosen $D'$. $\rho^{[-\delta]}$ with $\delta > 0$ is defined by

    $\rho^{[-\delta]}\, (\rho+\delta)^{[\delta]} = 1$.
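17. If $\mu_\nu(p') = E(p' - p)^\nu$, we have

    $\binom{n}{2}^{\nu-1} \mu_\nu(p') = \sum_{s=0}^{\nu} (-1)^s \binom{\nu}{s}\, p^s \binom{n}{2}^{s} \binom{n}{2}^{\nu-s-1} E(p')^{\nu-s}$.

Applying (32) with

    $\mu = \nu - s, \quad \gamma_1 = \kappa - s$,    (33)

we have for the coefficient of $p^\kappa \bar v$ in $\binom{n}{2}^{\nu-1} \mu_\nu(p')$

    $\sum_{s=0}^{\nu} (-1)^s \binom{\nu}{s} \binom{n}{2}^{s} D^{(\nu-s)}\, (n-2)^{[\bar a+2\kappa-2s-2]}$
    $= \sum_{s=0}^{\nu} (-1)^s \binom{\nu}{s} \frac{1}{2^s} \sum_{\rho=0}^{s} (-1)^\rho \binom{s}{\rho}\, n^{2s-\rho}\, D^{(\nu-s)}\, (n-2)^{[\bar a+2\kappa-2s-2]}$.

Inserting here, in accordance with (15),

    $n^{2s-\rho} = \sum_{\sigma=0}^{2s-\rho} d^{(\bar a+2\kappa-2s)}_{2s-\rho,\sigma}\, (n - \bar a - 2\kappa + 2s)^{[2s-\rho-\sigma]}$,

we have

    $\sum_{s=0}^{\nu} (-1)^s \binom{\nu}{s} \frac{1}{2^s} \sum_{\rho=0}^{s} \sum_{\sigma=0}^{2s-\rho} (-1)^\rho \binom{s}{\rho}\, D^{(\nu-s)}\, d^{(\bar a+2\kappa-2s)}_{2s-\rho,\sigma}\, (n-2)^{[\bar a+2\kappa-2-\rho-\sigma]}$.

Putting $\alpha = \bar a + 2\kappa - 2 - \rho - \sigma$, we have for the coefficient $K^{(\nu,\alpha)}$ of $p^\kappa \bar v\, (n-2)^{[\alpha]}$ in
$\binom{n}{2}^{\nu-1} \mu_\nu(p')$

    $K^{(\nu,\alpha)} = \sum_{s=0}^{\nu} (-1)^s \binom{\nu}{s}\, a^{(\nu,\alpha)}(s)$,    (34)

where

    $a^{(\nu,\alpha)}(s) = \frac{1}{2^s} \sum_{\rho=0}^{s} (-1)^\rho \binom{s}{\rho}\, D^{(\nu-s)}\, d^{(\bar a+2\kappa-2s)}_{2s-\rho,\ \bar a+2\kappa-\alpha-\rho-2}$.    (35)

Since $d^{(\bar a+2\kappa-2s)}_{2s-\rho,\ \bar a+2\kappa-\alpha-\rho-2} = 0$ if $\bar a + 2\kappa - \alpha - \rho - 2 < 0$, the upper limit of $\rho$ in the summation
may be taken as $\bar a + 2\kappa - \alpha - 2$, which is independent of $s$. We have then, in virtue of (31),

    $a^{(\nu,\alpha)}(s) = 2^{-\kappa}\, (\bar b+\kappa-s)^{[\bar b]}\, d_{\nu-s,\ \nu-\bar b-\kappa}\, D' \sum_{\rho=0}^{\bar a+2\kappa-\alpha-2} (-1)^\rho \binom{s}{\rho}\, d^{(\bar a+2\kappa-2s)}_{2s-\rho,\ \bar a+2\kappa-\alpha-\rho-2}$,    (36)

where $D'$ is independent of $s$.
In virtue of (I) and (II), para. 10, $a^{(\nu,\alpha)}(s)$ is a polynomial in $s$. The degree of the $(\rho+1)$th
term in the sum in (36) is $\rho + 2(\bar a + 2\kappa - \alpha - \rho - 2)$, which is highest for $\rho = 0$. Hence, the
degree $d$ of $a^{(\nu,\alpha)}(s)$ is

    $d = \bar b + 2(\nu - \bar b - \kappa) + 2(\bar a + 2\kappa - \alpha - 2) = 2(\nu + \bar a + \kappa - \alpha - 2) - \bar b$.    (37)

Now, according to (34) and (14), $K^{(\nu,\alpha)} = 0$ if $d < \nu$, or, in virtue of (37),

    $K^{(\nu,\alpha)} = 0$ if $2\alpha > \nu - 4 + 2\bar a - \bar b + 2\kappa$.    (38)

Applying (26) for pattern $\bar P$, we have, since $\bar\gamma_1 = 0$, $2\bar a \le 3\bar b$, and consequently

    $2\bar a - \bar b + 2\kappa \le 2(\bar b + \kappa)$,

the sign of equality holding if and only if pattern $P$ contains no other simple patterns
than $S_1$ and $S_2$.

Remembering that, according to para. 16, $b \le \mu$, we have in virtue of (33)

    $\bar b + \kappa = \bar b + \gamma_1 + s = b + s \le \mu + s = \nu$.    (39)

Thus in any case

    $2\bar a - \bar b + 2\kappa \le 2\nu$

and, in virtue of (38),

    $K^{(\nu,\alpha)} = 0$ if $\alpha > \tfrac32\nu - 2$.    (40)

If $P$ contains at least one simple pattern with more than two pairs, we even have

    $2\bar a - \bar b + 2\kappa < 2\nu$,

and consequently

    $K^{(\nu,\alpha)} = 0$ if $\alpha > \tfrac32\nu - \tfrac52$.    (41)

From (40) it appears that the degree in $n$ of $\binom{n}{2}^{\nu-1} \mu_\nu(p')$ is

    $\le 3h - 2 = \tfrac32\nu - 2$ if $\nu = 2h$,
    $\le 3h - 1 = \tfrac32\nu - \tfrac52$ if $\nu = 2h + 1$.

Thus, in $\mu_{2h+1}(p')$, if expanded in powers of $n$, the degree of the highest term is

    $\le 3h - 1 - 4h = -h - 1$.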




In $\mu_2(p')$, in virtue of (13), the degree of the highest term is $-1$, provided that $k - p^2 \ne 0$.
Hence, the degree of

    $\mu_{2h+1}(p') \big/ [\mu_2(p')]^{h+\frac12}$

is $\le -h - 1 + h + \tfrac12 = -\tfrac12$. It follows that

    $\mu_{2h+1}(p') \big/ [\mu_2(p')]^{h+\frac12} \to 0$ if $k - p^2 > 0$.    (42)

($k - p^2 < 0$ is impossible since in this case var $(p')$ would become $< 0$ for large $n$.)


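18. As we have seen, we may write

    $\binom{n}{2}^{2h-1} \mu_{2h}(p') = R_1\,(n-2)^{[3h-2]} + R_2\,(n-2)^{[3h-3]} + \dots$

Then it follows from (41) that $R_1$ only contains terms depending on patterns $S_1$ and $S_2$,
that is, $R_1$ is of the form

    $R_1 = \sum_{\kappa,\lambda} K^{(2h,3h-2)}_{\kappa\lambda}\, p^\kappa k^\lambda$.    (43)

The only terms in $\binom{n}{2}^{\mu-1} E(p')^\mu$ which can contribute to this sum are of the form

    $D^{(\mu)}\, p^{\gamma_1} k^\lambda\, (n-2)^{[2\gamma_1+3\lambda-2]}$.

The pattern corresponding to such a term is

    $P = \gamma_1 S_1 + \lambda S_2$.

Remembering the considerations in para. 16, we see that in each $S_2$ the pair drawn first
contributes to $C$ the factor $\tfrac12$ (except if it is $R_1$), while the first drawing of the other pair
yields the factor 2. Hence, the $S_2$'s make no contribution to $C$, and we have

    $C = 2^{-\gamma_1+1}$.

The contribution of the patterns $S_2$ to $A$ is twofold: since in each $S_2$ the two pairs may be
interchanged, this gives the factor $(1/2!)^\lambda$; and since the $\lambda$ patterns $S_2$ may be interchanged,
we have the factor $1/\lambda!$. Thus

    $D^{(\mu)} = 2^{-\gamma_1-\lambda+1}\, \frac{(\gamma_1+2\lambda)!}{\gamma_1!\,\lambda!}\, d_{\mu,\,\mu-\gamma_1-2\lambda}$

and, in virtue of (31), since $\bar b = 2\lambda$,

    $D' = \frac{2^{-\lambda+1}}{\lambda!}$.    (44)

Inserting in (38)

    $\nu = 2h, \quad \alpha = 3h - 2, \quad \bar a = 3\lambda, \quad \bar b = 2\lambda$,    (45)

we see that $K^{(2h,3h-2)}_{\kappa\lambda} \ne 0$ is possible only if

    $\kappa + 2\lambda \ge 2h$.

On the other hand, from (39), $\kappa + 2\lambda \le 2h$.
Hence,

    $\kappa = 2h - 2\lambda$.    (46)

Inserting this in (43), we have

    $R_1 = \sum_{\lambda=0}^{h} K^{(2h,3h-2)}_{2h-2\lambda,\lambda}\, p^{2h-2\lambda} k^\lambda$,    (47)

where

    $K^{(2h,3h-2)}_{2h-2\lambda,\lambda} = \sum_{s=0}^{2h} (-1)^s \binom{2h}{s}\, a^{(2h,3h-2)}(s)$.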





















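According to (37) in connexion with (45), (46), the degree in $s$ of $a^{(2h,3h-2)}(s)$ is $2h$. The
highest term, $a_0 s^{2h}$, is contained in the term corresponding to $\rho = 0$ in (36). Inserting in
(36) the values from (44), (45) and (46) and putting in the sum $\rho = 0$, we have

    $2^{-2h+2\lambda}\, (2h-s)^{[2\lambda]}\, d_{2h-s,0}\, \frac{2^{-\lambda+1}}{\lambda!}\, d^{(4h-\lambda-2s)}_{2s,\,h-\lambda}$.

Thus, in virtue of (16) and (II), para. 10,

    $a_0 = \frac{(-1)^{h-\lambda}}{2^{h-1}\,\lambda!\,(h-\lambda)!} = \frac{(-1)^{h-\lambda}}{2^{h-1}\,h!} \binom{h}{\lambda}$.

According to (14),

    $K^{(2h,3h-2)}_{2h-2\lambda,\lambda} = \frac{(-1)^{h-\lambda}\,(2h)!}{2^{h-1}\,h!} \binom{h}{\lambda}$.

Inserting this in (47) we have

    $R_1 = \frac{(2h)!}{2^{h-1}\,h!} \sum_{\lambda=0}^{h} (-1)^{h-\lambda} \binom{h}{\lambda}\, p^{2h-2\lambda} k^\lambda = \frac{(2h)!}{2^{h-1}\,h!}\, (k - p^2)^h$.

The highest term of $\mu_{2h}(p')$ is thus

    $\frac{(2h)!}{h!}\, 2^h\, (k - p^2)^h\, n^{-h}$,

that of $\mu_2(p')$ is $4(k - p^2)\, n^{-1}$, and hence

    $\frac{\mu_{2h}(p')}{[\mu_2(p')]^h} \to \frac{(2h)!}{2^h\, h!}$ if $k - p^2 > 0$.    (48)

From (42) and (48) it follows according to the Second Limit Theorem that the distribution
of $p'$ tends to normality as $n \to \infty$, provided that the marginal distributions are continuous
and $k - p^2 > 0$.

The condition $k - p^2 > 0$ is fulfilled if the population is distributed normally. For,
comparing Esscher's formula (5) with (13), we find, since var $(\tau) = 4$ var $(p')$,

    $k - p^2 = \frac14 \left\{\frac19 - \left(\frac{2}{\pi}\sin^{-1}\frac{r}{2}\right)^2\right\}$.

The right-hand side is positive if $|r| < 1$.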


V. THE VARIANCE OF p’ IN THE CASE OF A FINITE POPULATION 


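19. Consider a sample of $n$ drawn from a finite population of $N$ in which all values of $x$
and all values of $y$ are different. For the sample probabilities $p'$, $k'$, $(p^2)'$, ... we write now

    $p^{(n)},\ k^{(n)},\ (p^2)^{(n)},\ \dots$

and for the corresponding population probabilities

    $p^{(N)},\ k^{(N)},\ (p^2)^{(N)},\ \dots$

Equation (9) remains valid and may be written as follows:

    $E w^{(n)} = w^{(N)}$.    (49)

In particular, $E p^{(n)} = p^{(N)}$.

The essential difference between this case and the case $N = \infty$ considered above is that the
composite probabilities such as $(p^2)^{(N)}$ or $(pk)^{(N)}$ are not equal to $(p^{(N)})^2$ or $p^{(N)} k^{(N)}$. For
instance, $(p^{(N)})^2$ is evidently the same function of $p^{(N)}$, $k^{(N)}$, $(p^2)^{(N)}$ and $N$ as $(p')^2$ is of $p'$, $k'$,
$(p^2)'$ and $n$. Thus we have, replacing $n$ by $N$ in (10),

    $(p^{(N)})^2 = \frac{2}{N(N-1)}\, p^{(N)} + \frac{4(N-2)}{N(N-1)}\, k^{(N)} + \frac{(N-2)(N-3)}{N(N-1)}\, (p^2)^{(N)}$,    (50)

and hence

    $(p^2)^{(N)} = \frac{N(N-1)(p^{(N)})^2 - 2p^{(N)} - 4(N-2)k^{(N)}}{(N-2)(N-3)}$.    (51)

On the other hand, from (10) and (49)

    $E(p^{(n)})^2 = \frac{2}{n(n-1)}\, p^{(N)} + \frac{4(n-2)}{n(n-1)}\, k^{(N)} + \frac{(n-2)(n-3)}{n(n-1)}\, (p^2)^{(N)}$,    (52)

which is the equivalent of equation (11).
On subtracting (50) from (52) we find

    $\operatorname{var}(p^{(n)}) = \frac{2(N-n)}{n(n-1)N(N-1)} \{(N+n-1)\, p^{(N)} + 2[Nn - 2(N+n-1)]\, k^{(N)} - [2Nn - 3(N+n-1)]\, (p^2)^{(N)}\}$.    (53)

Substituting for $(p^2)^{(N)}$ the expression in (51), we obtain

    $\operatorname{var}(p^{(n)}) = \frac{2(N-n)}{n(n-1)(N-2)(N-3)} \{(N+n-5)\, p^{(N)}(1 - p^{(N)}) + 2(n-2)(N-2)(k^{(N)} - (p^{(N)})^2)\}$,    (54)

or

    $\binom{n}{2} \operatorname{var}(p^{(n)}) = \left(1 - \frac{(n-2)(n-3)}{(N-2)(N-3)}\right) p^{(N)}(1 - p^{(N)}) + 2(n-2)\left(1 - \frac{n-3}{N-3}\right)(k^{(N)} - (p^{(N)})^2)$.    (55)

For $N \to \infty$, (55) becomes the same as (13).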
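The finite-population variance can be compared with a direct enumeration of all samples. The sketch below assumes the reading $\operatorname{var}(p^{(n)}) = \frac{2(N-n)}{n(n-1)(N-2)(N-3)}\{(N+n-5)p(1-p) + 2(n-2)(N-2)(k-p^2)\}$ of (54), with $p$ and $k$ taken for the population; the population ranking and the function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

def p_and_k(ranks):
    """p and k for members listed in x-order; ranks holds their y-values."""
    m = len(ranks)
    # g[i] = number of members concordant with member i
    g = [
        sum(1 for j in range(m) if (j - i) * (ranks[j] - ranks[i]) > 0)
        for i in range(m)
    ]
    p = Fraction(sum(g), m * (m - 1))
    k = Fraction(sum(v * (v - 1) for v in g), m * (m - 1) * (m - 2)) if m > 2 else Fraction(0)
    return p, k

def exact_variance(population, n):
    """Variance of the sample p over all samples of n drawn without replacement."""
    values = [p_and_k(sample)[0] for sample in combinations(population, n)]
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

def formula_54(population, n):
    """Variance of the sample p computed from the assumed form of equation (54)."""
    N = len(population)
    p, k = p_and_k(population)
    scale = Fraction(2 * (N - n), n * (n - 1) * (N - 2) * (N - 3))
    return scale * ((N + n - 5) * p * (1 - p) + 2 * (n - 2) * (N - 2) * (k - p * p))

population = (3, 1, 4, 6, 2, 5, 7)  # hypothetical y-ranking of N = 7 members
for n in range(2, 8):
    print(n, exact_variance(population, n), formula_54(population, n))
```

Because `combinations` preserves the order of the population, each sample is already arranged by $x$, and only the relative order of its $y$-values matters.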











VI. A SAMPLE ESTIMATE OF var (p’) 
20. In the case of an infinite population, let

    $\binom{n}{2} \operatorname{var}'(p') = p' + 2(n-2)k' - (2n-3)(p^2)'$.    (56)

Then, in virtue of (9) and (12),

    $E \operatorname{var}'(p') = \operatorname{var}(p')$.

On inserting in (56) for $(p^2)'$ the expression obtained from (10), we find

    $\binom{n-2}{2} \operatorname{var}'(p') = p'(1 - p') + 2(n-2)(k' - (p')^2)$,

or

    $\operatorname{var}'(p') = \frac{2}{(n-2)(n-3)}\, p'q' + \frac{4}{n-3}\, (k' - (p')^2)$.    (57)

By analogy,

    $\operatorname{var}'(q') = \frac{2}{(n-2)(n-3)}\, p'q' + \frac{4}{n-3}\, (l' - (q')^2)$,    (58)

where $l'$ is the probability that among three members A, B, C drawn from the sample without
replacement, one, say A, is discordant with the other two.
In the case of a finite population of the type considered in para. 19, we define in a similar
way a statistic $\operatorname{var}^{(n)}(p^{(n)})$ such that

    $E \operatorname{var}^{(n)}(p^{(n)}) = \operatorname{var}(p^{(n)})$.

We find

    $\operatorname{var}^{(n)}(p^{(n)}) = \frac{2(N-n)}{N(N-1)(n-2)(n-3)} \{(N+n-5)\, p^{(n)}(1 - p^{(n)}) + 2(n-2)(N-2)(k^{(n)} - (p^{(n)})^2)\}$.    (59)

A comparison between (59) and (54) shows that $\operatorname{var}^{(n)}(p^{(n)})$ is obtained from var $(p^{(n)})$
by interchanging $n$ and $N$ and taking the opposite sign.





21. Let $g_\nu$ and $h_\nu$ be the numbers of sample members concordant and discordant with
$A_\nu = (x_\nu, y_\nu)$ $(\nu = 1, \dots, n)$. The probability of drawing first the member $A_\nu$, and then,
without replacing it, a member concordant with $A_\nu$ is $\frac{1}{n} \cdot \frac{g_\nu}{n-1}$. The probability of
drawing, without replacement, first $A_\nu$ and then two other members concordant with $A_\nu$ is
$\frac{g_\nu(g_\nu - 1)}{n(n-1)(n-2)}$. Hence

    $p' = \frac{\sum g_\nu}{n(n-1)}, \quad k' = \frac{\sum g_\nu^2 - \sum g_\nu}{n(n-1)(n-2)}$,    (60)

    $q' = \frac{\sum h_\nu}{n(n-1)}, \quad l' = \frac{\sum h_\nu^2 - \sum h_\nu}{n(n-1)(n-2)}$.    (61)

If only the value of $p'$ or $q'$ is required, the use of (6) may be more expedient than that of
(60) or (61). If, however, the variance, and hence $k'$ or $l'$, is wanted, the calculation by means
of the numbers $g_\nu$ and $h_\nu$ (whose sums are twice the numbers of inversions $I$, $J$) according
to (60) or (61) is to be preferred.

If $p' > \tfrac12$, it is more convenient to calculate $q'$ and $l'$ from (61); if $p' < \tfrac12$, the calculation of
$p'$ and $k'$ by (60) is more rapid. In many cases one can see directly from the given data
whether the concordant or the discordant pairs prevail, before actually calculating $p'$ or $q'$.
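The calculation by means of the numbers $g_\nu$ can be sketched as follows. The code assumes the readings $p' = \sum g_\nu / n(n-1)$ and $k' = (\sum g_\nu^2 - \sum g_\nu)/n(n-1)(n-2)$ of (60), and $\operatorname{var}'(p') = \frac{2}{(n-2)(n-3)}p'q' + \frac{4}{n-3}(k' - p'^2)$ of (57); it requires $n \ge 4$ and data without ties. The function names are hypothetical.

```python
from fractions import Fraction
from itertools import permutations

def variance_estimate(xs, ys):
    """Return (p', k', var'(p')) computed from the concordance counts g."""
    n = len(xs)
    # g[i] = number of sample members concordant with member i
    g = [
        sum(1 for j in range(n) if (xs[j] - xs[i]) * (ys[j] - ys[i]) > 0)
        for i in range(n)
    ]
    p = Fraction(sum(g), n * (n - 1))
    k = Fraction(sum(v * v for v in g) - sum(g), n * (n - 1) * (n - 2))
    q = 1 - p
    var = Fraction(2, (n - 2) * (n - 3)) * p * q + Fraction(4, n - 3) * (k - p * p)
    return p, k, var

def mean_estimate_under_null(n):
    """Average of var'(p') over all n! equally probable rankings."""
    estimates = [variance_estimate(range(n), perm)[2] for perm in permutations(range(n))]
    return sum(estimates) / len(estimates)

# In the null case the average estimate equals (2n + 5) / (18 n (n - 1)).
print(mean_estimate_under_null(4), Fraction(13, 216))
```

The check in the last line illustrates the unbiasedness of the estimate in the special case where all rankings are equally probable; individual values of the estimate can be negative in small samples.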


Since $p' + q' = 1$, we have var $(p')$ = var $(q')$,
and also, in the case of a finite population,

    $\operatorname{var}(p^{(n)}) = \operatorname{var}(q^{(n)})$.

If we write down the equation for var $(q^{(n)})$ analogous to (55) and subtract it from (55),
we have

    $k^{(N)} - (p^{(N)})^2 - l^{(N)} + (q^{(N)})^2 = 0$,

or

    $k^{(N)} - l^{(N)} = (p^{(N)})^2 - (q^{(N)})^2 = p^{(N)} - q^{(N)}$.

Substituting $n$ for $N$, we have

    $k' - l' = (p')^2 - (q')^2 = p' - q'$.

Comparing this with (57) and (58) we see that

    $\operatorname{var}'(p') = \operatorname{var}'(q')$.










REFERENCES 


ESSCHER, F. (1924). On a method of determining correlation from the ranks of the variates. Skand.
Aktuar. 7, 201-19.

JORDAN, CH. (1927). Statistique mathématique. Paris: Gauthier-Villars.

KENDALL, M. G. (1938). A new measure of rank correlation. Biometrika, 30, 81-93.

LINDEBERG, J. W. (1926). Ueber die Korrelation. Den VI skandinaviske Matematikerkongres i København,
31 August-4 September 1925, pp. 437-46. København: J. Gjellerup.


ADDENDUM 


On p. 184 above, I quoted J. W. Lindeberg as having given the formula for the variance 
of the probability of concordance p’ without proof. I was not aware then that a proof of 
this formula, as well as that of the corresponding expression for a finite population 
(equation (54) of my paper), is contained in another paper by Lindeberg, ‘Some remarks 
on the mean error of the percentage of correlation,’ Nordic Statistical Journal, 1, 137-41
(1929). 










THE SIGNIFICANCE OF RANK CORRELATIONS WHERE 
PARENTAL CORRELATION EXISTS 


By H. E. DANIELS (Wool Industries Research Association) 
AND M. G. KENDALL


1. All the known tests of significance of rank correlation coefficients are based on dis- 
tributions from a population in which each possible ranking occurs equally frequently, 
i.e. the null case where no parental correlation exists. We may then say of any particular 
coefficient whether it is significant in the sense that it cannot have arisen with any acceptable 
probability from an uncorrelated population. No tests are known in the case where parental 
correlation exists, and we have not seen the point discussed except in reference to the 
replacement of rank correlations by grade or product-moment correlations. Thus, for 
example, if two rank correlation coefficients are both found to be significant there has 
hitherto been no exact method of deciding whether their difference is significant. In this 
paper we consider the problem of determining confidence intervals for a rank correlation 


when the parent is correlated and develop a test of significance for the difference of two 
correlations. 


2. In testing an ordinary product-moment correlation the problem is enormously 
simplified by the assumption that the population is normal, or the further assumption that 
normal theory holds good even when the parent deviates only moderately from normality. 
Apart from means and variances the population is then completely specified by the single
parent parameter ρ and, as is well known, the sample distribution of the estimator depends
only on ρ and the sample number n.

In ranking theory this position no longer obtains. No assumption can in general be made 
about the form of the parent distribution and, in particular, the parent correlation does not 
completely specify the problem. The usual type of variate theory cannot, therefore, be 
expected to meet the requirements. 


3. A satisfactory approach to the problem can, however, be made if the rank correlation
is measured by the coefficient known as τ (Kendall, 1943, chap. 16). We shall then show that,
for large samples at any rate, the problem admits of a solution.

Let the population consist of N members. They may be imagined as laid out in the natural
order 1, 2, ..., N according to the first variate. The rankings according to the second variate
are then some permutation of the numbers 1 to N, and this second array of ranks is all we
need write down in particular cases. It determines the rank correlation τ. Now suppose
we choose a sample of n in one of the $\binom{N}{n}$ possible ways. This sample will, so far as the first
variate is concerned, be in the natural order, and the ranks according to the second variate
permit of the calculation of a sample correlation t. For all possible samples and any given
arrangement of the parent members there will be a distribution of $\binom{N}{n}$ values of t.











198 Significance of rank correlations where parental correlation exists 

4. The sample value of t is an unbiased estimator of τ; that is to say, the mean value of
t in all possible samples is τ. For consider the $\binom{N}{n}$ samples of n. Any particular pair of
members will occur in $\binom{N-2}{n-2}$ samples, that is, all pairs occur equally frequently in the
totality of all samples. In calculating t we assign to any pair +1 if its members are in the
right order and −1 in the contrary case. Thus the total of the score for all samples is
$\binom{N-2}{n-2}$ times the score for the population. To obtain t we divide the score for any sample
by ½n(n − 1), and to obtain τ we divide the population score by ½N(N − 1). Hence if Σ is the
score for the population, the mean value (expectation) of t is

$$E(t) = \frac{\binom{N-2}{n-2}\,\Sigma}{\binom{N}{n}\,\tfrac12 n(n-1)} = \frac{\Sigma}{\tfrac12 N(N-1)} = \tau. \tag{4-1}$$


5. Unfortunately, it is not true that higher moments of t depend only on τ. A single
example will illustrate the point. Consider the ranking of 9:


e232 :S- FOr 6 9. 4, 





If the 84 = $\binom{9}{3}$ possible samples of three are written down and t evaluated for each, the
distribution of S (the number of positive pairs) is found to be as follows:

    Values of S    Frequency
         0              2
         1             15
         2             34
         3             33
       Total           84

The mean of this distribution is 182/84 = 13/6, and since

$$t = \frac{2S}{\tfrac12 n(n-1)} - 1,$$

the mean value of t is (26/18) − 1 = 0.44. The value of S for the parent ranking is 26 and hence
τ = (52/36) − 1 = 0.44, verifying equation (4-1). The ranking
- & ee S te 
also has τ = 0.44, but the distribution of S in samples of three is now:

    Values of S    Frequency
         0              3
         1             16
         2             29
         3             36
       Total           84

H. E. DANIELS AND M. G. KENDALL 199


The second moment of this distribution is 5.429, against 5.333 for the first distribution, the
variances being 0.734 against 0.639.
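The point of §§4 and 5 can be checked by brute force on a small parent. The following is only an illustrative sketch: the two four-member rankings are hypothetical choices made here (they are not the rankings of nine used in the text), and the helper names are likewise assumptions. Both rankings have τ = 0, every sample mean of t reproduces τ exactly, yet the sampling variances differ.

```python
from fractions import Fraction
from itertools import combinations


def positive_pairs(ranks):
    """S: number of pairs, taken in first-variate order, whose ranks ascend."""
    return sum(1 for a, b in combinations(ranks, 2) if a < b)


def t_from_S(S, n):
    """t = 2S / (n(n-1)/2) - 1, kept as an exact fraction."""
    return Fraction(2 * S, n * (n - 1) // 2) - 1


def sampling_moments(parent, n):
    """Exact mean and variance of t over all C(N, n) equally likely samples.

    combinations() keeps members in parent order, so each sample is already
    laid out in the natural order of the first variate.
    """
    values = [t_from_S(positive_pairs(s), n) for s in combinations(parent, n)]
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, var


# Hypothetical parents, chosen only because both have S = 3 out of 6 pairs.
ranking_a = [1, 4, 3, 2]
ranking_b = [2, 4, 1, 3]

for parent in (ranking_a, ranking_b):
    tau = t_from_S(positive_pairs(parent), len(parent))
    mean_t, var_t = sampling_moments(parent, 3)
    print(parent, "tau =", tau, "E(t) =", mean_t, "var t =", var_t)
```

For these two parents the variance of t in samples of three is 1/3 and 1/9 respectively, which is the same phenomenon as the 0.734 against 0.639 found for S in the example of nine.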


6. Thus for any parent with given τ there is in general more than one sampling distribution
of t according to the arrangement of the parent ranks. In short, as mentioned above, the
parameter τ does not completely specify the sampling distribution and in asking the question:
What is the standard error of t? we are seeking for an answer which does not exist.

It will be shown, however, that for any given parent ranking the distribution of t tends to
normality with increasing n. The sampling properties of t can therefore be specified to a
first approximation by its first and second moments only, when the samples are not too small.
Further, it will be proved that for given τ the variance of t cannot exceed a certain function
of τ and n whatever the parent ranking. From a knowledge of t and n only, it is thus possible
to set outer bounds to confidence intervals for τ provided n is large enough for the normal
approximation to hold. The limits obtained in this way are sometimes rather wide, and an
alternative procedure is to estimate the true variance of t directly from the sample itself
according to a formula given below. This avoids the loss of efficiency consequent on using an
upper limit to the variance, but it is not known how large a sample is required for the error
of estimation to be tolerable.


7. The development of the theory is facilitated if we introduce at the present stage a
notation similar to that used by Daniels (1944). The ith and jth ranks corresponding to the
second variate are together assigned a score $a_{ij}$ which takes the value +1 if the members
are in the correct order, −1 if in the wrong order, and $a_{ii}$ is defined to be zero. The ranks for
the first variate are similarly assigned scores $b_{ij}$, but as the members have been taken in the
correct order for this variate, the scores are simply $b_{ij} = \pm 1$, $i \lessgtr j$; $b_{ii} = 0$. Next we define
$c_{ij} = a_{ij}b_{ij}$, so that $c_{ij} = \pm 1$, according to whether the ranks for the two variates agree or
differ in order, and $c_{ii} = 0$. In this notation

$$\tau = c/N(N-1),$$

where $c = \sum c_{ij}$, i and j both being summed from 1 to N.

When the sample of n pairs is selected at random from the parent N and its coefficient
t is calculated, the values of $c_{ij}$ for the members of the sample remain the same as in the
population. This fact makes τ much more suitable for the present problem than the Spearman
coefficient ρ whose associated scores do not possess the same property. The sample rank
correlation is then

$$t = c^{(n)}/n(n-1),$$

where $c^{(n)} = {\sum}^{(n)} c_{ij}$, and ${\sum}^{(n)}$ denotes summation only over those values of i and j occurring
in the sample.
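The score notation lends itself to a direct computation. The sketch below is a minimal illustration under stated assumptions: the function names are invented here, $b_{ij}$ is taken as +1 for i < j, and the parent of nine is simply a convenient ranking with 26 positive pairs. It shows that the scores of a sample are the corresponding entries of the parent matrix, carried over unchanged.

```python
def sign(x):
    return (x > 0) - (x < 0)


def score_matrix(ranks):
    """c_ij = a_ij * b_ij for members laid out in first-variate order.

    b_ij = sign(j - i) scores the first variate (natural order) and
    a_ij = sign(ranks[j] - ranks[i]) scores the second variate.
    """
    N = len(ranks)
    return [[sign(j - i) * sign(ranks[j] - ranks[i]) for j in range(N)]
            for i in range(N)]


def coefficient(c):
    """tau (or t) = sum of all c_ij divided by N(N-1)."""
    N = len(c)
    return sum(map(sum, c)) / (N * (N - 1))


def sample_scores(c, members):
    """Scores for a sample: the parent c_ij restricted to the chosen members."""
    return [[c[i][j] for j in members] for i in members]


parent = [1, 2, 3, 4, 9, 8, 7, 6, 5]      # illustrative parent, S = 26
c = score_matrix(parent)
members = [0, 4, 8]                        # positions of ranks 1, 9, 5
print(coefficient(c))                      # parent tau
print(coefficient(sample_scores(c, members)))  # sample t
```

The sub-matrix picked out of the parent is identical to the matrix computed afresh from the sample ranks, which is the property the text contrasts with the Spearman scores.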


8. It has already been proved that E(t) = τ. To find the variance of t we require E(t²),
so consider

$$\sum_n [c^{(n)}]^2 = \sum_n {\sum}^{(n)} c_{ij}c_{kl},$$

$\sum_n$ denoting summation over all selections of the sample of n from the finite parent population
of N members. Let us enumerate the number of ways in which $c_{ij}c_{kl}$ and similar products
with 'tied' suffixes, such as $c_{ij}c_{il}$, occur in the sum.

(i) When i, j, k, l are all different the term $c_{ij}c_{kl}$ may occur with $\binom{N-4}{n-4}$ selections of the
remaining members of the sample and the contribution of such terms to $\sum_n$ is $\binom{N-4}{n-4}{\sum}' c_{ij}c_{kl}$,
${\sum}'$ meaning summation over all unequal values of i, j, k, l from 1 to N.

(ii) The term $c_{ij}c_{il}$ similarly occurs in $\binom{N-3}{n-3}$ ways and there are four ways of tying one
suffix, each of which gives the same contribution to $\sum_n$ since $c_{ij}$ is symmetrical. The total
contribution of such terms to $\sum_n$ is therefore $4\binom{N-3}{n-3}{\sum}' c_{ij}c_{il}$.

(iii) Terms like $c_{ij}c_{ij}$ similarly contribute $2\binom{N-2}{n-2}{\sum}' c_{ij}c_{ij}$ to $\sum_n$, and all other terms are
zero since $c_{ii} = 0$. Hence

$$\sum_n [c^{(n)}]^2 = \binom{N-4}{n-4}{\sum}' c_{ij}c_{kl} + 4\binom{N-3}{n-3}{\sum}' c_{ij}c_{il} + 2\binom{N-2}{n-2}{\sum}' c_{ij}c_{ij}.$$

Expressing the ${\sum}'$'s in terms of the corresponding $\sum$'s and dividing out by $\binom{N}{n}$ we obtain

$$E[c^{(n)}]^2 = \frac{n^{(4)}}{N^{(4)}}\Bigl(\sum c_{ij}c_{kl} - 4\sum c_{ij}c_{il} + 2\sum c_{ij}c_{ij}\Bigr) + \frac{4n^{(3)}}{N^{(3)}}\Bigl(\sum c_{ij}c_{il} - \sum c_{ij}c_{ij}\Bigr) + \frac{2n^{(2)}}{N^{(2)}}\sum c_{ij}c_{ij},$$

where $n^{(r)} = n(n-1)\ldots(n-r+1)$. Since $\sum c_{ij}c_{ij} = N(N-1)$ and $\sum c_{ij}c_{kl} = c^2$, the variance of
t for given τ and n is seen to depend on the value of $\sum c_{ij}c_{il} = \sum c_i^2$, where $c_i = \sum_j c_{ij}$.

Let N become large. The quantities c and $\sum c_i^2$ are respectively $O(N^2)$ and $O(N^3)$, so if we
introduce $\tau_i = c_i/N$ the value of E(t²) for large N becomes

$$\frac{(n-2)(n-3)}{n(n-1)}\tau^2 + \frac{4(n-2)}{n(n-1)}\frac{\sum\tau_i^2}{N} + \frac{2}{n(n-1)},$$

and hence in the limit the variance of t is

$$\operatorname{var} t = \frac{2}{n(n-1)}\bigl\{2(n-2)\operatorname{var}\tau_i + 1 - \tau^2\bigr\}, \tag{8-1}$$

where $\operatorname{var}\tau_i = \sum\tau_i^2/N - \tau^2$.
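The counting argument of §8 can be verified exactly on a small finite parent. The sketch below is an illustration only: it takes the finite-population expression for $E[c^{(n)}]^2$ in the form $\frac{n^{(4)}}{N^{(4)}}(c^2 - 4\sum c_i^2 + 2N(N-1)) + \frac{4n^{(3)}}{N^{(3)}}(\sum c_i^2 - N(N-1)) + \frac{2n^{(2)}}{N^{(2)}}N(N-1)$, and compares it with a full enumeration of samples. The parent ranking and all function names are assumptions made for the example.

```python
from fractions import Fraction
from itertools import combinations


def score_matrix(ranks):
    """c_ij = +1 if the pair agrees in order for both variates, -1 if not, 0 on the diagonal."""
    sign = lambda x: (x > 0) - (x < 0)
    N = len(ranks)
    return [[sign(j - i) * sign(ranks[j] - ranks[i]) for j in range(N)]
            for i in range(N)]


def falling(x, r):
    """x^(r) = x(x-1)...(x-r+1)."""
    out = 1
    for k in range(r):
        out *= x - k
    return out


def second_moment_formula(ranks, n):
    """E[c^(n)]^2 from the tied-suffix counting argument."""
    N = len(ranks)
    rows = [sum(r) for r in score_matrix(ranks)]
    s_kl = sum(rows) ** 2             # sum c_ij c_kl = c^2
    s_il = sum(x * x for x in rows)   # sum c_ij c_il = sum c_i^2
    s_ij = N * (N - 1)                # sum c_ij c_ij
    ratio = lambda r: Fraction(falling(n, r), falling(N, r))
    return (ratio(4) * (s_kl - 4 * s_il + 2 * s_ij)
            + 4 * ratio(3) * (s_il - s_ij)
            + 2 * ratio(2) * s_ij)


def second_moment_enumerated(ranks, n):
    """E[c^(n)]^2 by listing every sample of n members."""
    c = score_matrix(ranks)
    squares = []
    for members in combinations(range(len(ranks)), n):
        cn = sum(c[i][j] for i in members for j in members)
        squares.append(cn * cn)
    return Fraction(sum(squares), len(squares))


parent = [1, 2, 3, 4, 9, 8, 7, 6, 5]   # illustrative parent with S = 26
for n in (3, 4, 5):
    print(n, second_moment_formula(parent, n), second_moment_enumerated(parent, n))
```

Dividing either value by $[n(n-1)]^2$ gives E(t²) for the finite parent; the large-N form (8-1) is the limit of the same expression.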
9. The variance of t satisfies the inequality

$$\operatorname{var} t \le \frac{2}{n}(1-\tau^2), \tag{9-1}$$

whatever the parent ranking. Moreover, though the limit may not be attained in any
particular parent ranking, reasons are given in the Appendix for expecting that it cannot
be substantially improved upon. The proof is as follows.

Reverting to a finite parent population of N members, we first seek a maximum for $\sum c_i^2$.
In terms of the original scores, $c_{ij} = a_{ij}b_{ij}$. Keeping $b_{ij} = \pm 1$, $i \lessgtr j$, $b_{ii} = 0$, as before, allow
the $a_{ij}$'s to assume any values subject to the conditions

$$\sum a_{ij}^2 = N(N-1), \qquad \sum a_{ij}b_{ij} = c.$$

The stationary values of $\sum c_i^2$ occur when the $a_{ij}$'s satisfy the equations

$$b_{ij}(c_i + c_j) - \lambda a_{ij} - \mu b_{ij} = 0,$$

which give, on multiplying by $b_{ij}$ and summing over j,

$$c_i = \frac{\mu(N-1) - c}{N-2-\lambda}.$$

Thus, unless the $c_i$'s are all to be equal, in which case $\sum c_i^2$ is a minimum, λ and μ must take
the values

$$\lambda = N-2, \qquad \mu = c/(N-1),$$

and since

$$2\sum c_i^2 - \lambda N(N-1) - \mu c = 0,$$

it follows that $\sum c_i^2$ cannot exceed $\tfrac12 N(N-1)(N-2) + \tfrac12 c^2/(N-1)$. Allowing N to become
large, this implies

$$\sum\tau_i^2/N \le \tfrac12(1+\tau^2).$$

Hence

$$\operatorname{var}\tau_i \le \tfrac12(1-\tau^2),$$

and so from equation (8-1), $\operatorname{var} t \le \dfrac{2}{n}(1-\tau^2)$, which is (9-1).
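The finite-N inequality for $\sum c_i^2$ on which the proof rests can be checked exhaustively for a small N. The sketch below does this for every ranking of six members; the choice N = 6 and the function names are assumptions made for illustration, and the inequality is cleared of fractions so that only integers are compared.

```python
from itertools import permutations


def row_scores(ranks):
    """c_i = sum over j of c_ij for a parent laid out in first-variate order."""
    sign = lambda x: (x > 0) - (x < 0)
    N = len(ranks)
    return [sum(sign(j - i) * sign(ranks[j] - ranks[i]) for j in range(N))
            for i in range(N)]


def bound_holds(ranks):
    """Check sum c_i^2 <= N(N-1)(N-2)/2 + c^2 / (2(N-1)), multiplied through by 2(N-1)."""
    N = len(ranks)
    ci = row_scores(ranks)
    c = sum(ci)
    lhs = 2 * (N - 1) * sum(x * x for x in ci)
    rhs = N * (N - 1) ** 2 * (N - 2) + c * c
    return lhs <= rhs


# Exhaustive check over all 720 rankings of six members.
all_ok = all(bound_holds(p) for p in permutations(range(1, 7)))
print(all_ok)
```

The natural order 1, 2, ..., N attains the bound exactly, since then every $c_i = N-1$ and $c = N(N-1)$.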


10. Assuming that the sample is large enough for the distribution of t to be normal, the
roots τ₁, τ₂ of the equation

$$t - \tau = x\sqrt{\frac{2}{n}(1-\tau^2)}, \tag{10-1}$$

i.e.

$$\tau = \frac{t \pm x\sqrt{\dfrac{2}{n}\Bigl(1 + \dfrac{2x^2}{n} - t^2\Bigr)}}{1 + \dfrac{2x^2}{n}}, \tag{10-2}$$

provide confidence limits to τ when t is known, x being the standardized normal deviate
corresponding to a given probability of P %. These confidence limits are of course maxima,
in the sense that we shall be wrong in at most P % of the cases in asserting τ to lie between
the calculated limits.

In our proof of the tendency of t to normality it will be necessary to neglect terms of order
$n^{-1/2}$, and the sample may have to be rather large for such terms to be small, unless τ itself
is small.

The form of equation (9-1) suggests using

$$w = \sin^{-1} t$$

instead of t. To the same order of approximation we can take w as having a normal
distribution with mean $\omega = \sin^{-1}\tau$ and standard error not exceeding $\sqrt{2/n}$, which is
independent of τ. This form is more convenient for assigning confidence limits to τ, and for testing
the significance of the difference between t₁ and t₂ (whose standard error cannot exceed
$\sqrt{2(1/n_1 + 1/n_2)}$), but we have not been able to discover whether the transformation brings
the distribution nearer to normality.
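As a numerical illustration of (10-2) and of the arcsine form, the following sketch computes both sets of outer limits. The function names are assumptions, as is the clamping of the transformed limits to ±π/2 before taking the sine (the text does not discuss that case). The trial values t = 0.490 and n = 30 are those of assessor A in the worked example of §17.

```python
import math


def max_confidence_limits(t, n, x=1.96):
    """Roots of t - tau = x * sqrt((2/n)(1 - tau^2)), i.e. equation (10-2)."""
    k = 2.0 * x * x / n
    half_width = x * math.sqrt((2.0 / n) * (1.0 + k - t * t))
    return (t - half_width) / (1.0 + k), (t + half_width) / (1.0 + k)


def arcsine_confidence_limits(t, n, x=1.96):
    """Limits from w = arcsin(t), treated as normal with standard error sqrt(2/n)."""
    w = math.asin(t)
    h = x * math.sqrt(2.0 / n)
    # Assumption: keep w +/- h inside [-pi/2, pi/2] so that sin stays monotone.
    lower = max(w - h, -math.pi / 2)
    upper = min(w + h, math.pi / 2)
    return math.sin(lower), math.sin(upper)


print(max_confidence_limits(0.490, 30))
print(arcsine_confidence_limits(0.490, 30))
```

With these inputs the two methods give roughly (−0.02, 0.80) and (0.01, 0.85), in agreement with the 5 % limits quoted for assessor A.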


11. We now prove that the distribution of t tends for large n to normality whatever the
parent ranking, provided that |τ| is not near unity.

Write $g_{ij} = c_{ij} - c/N^2$ so that $\sum g_{ij} = 0$, $g_{ij} = g_{ji}$, and $g_{ii} = -c/N^2 = -(N-1)\tau/N$. The
rth moment of $c^{(n)}$ about its mean value is $E[{\sum}^{(n)} g_{ij}]^r$, so consider

$$\sum_n \bigl[{\sum}^{(n)} g_{ij}\bigr]^r = \sum_n {\sum}^{(n)} g_{ij}g_{kl}g_{uv}\cdots,$$

the summation $\sum_n$ being over all possible sample selections.

The argument used by Daniels (1944) to show that in the null case the distribution of rank
correlation in large samples tends to normality can be applied with little modification to the
present problem. The proof is therefore sketched here without much detail.

Two essential conditions to be satisfied are that $\sum g_{ij} = 0$, which is true by definition, and
$\sum g_{ij}g_{ik} = O(N^3)$, which is true only if $1 - \tau^2 = O(1)$, so that the tendency to normality may
be expected to break down for high correlations.

The sum $\sum_n$ is evaluated as in §8 by counting the number of ways in which terms like
$g_{ij}g_{kl}g_{uv}\cdots$, and similar terms with tied suffixes, occur. In this way it is expressed as a linear
combination of ${\sum}' g_{ij}g_{kl}g_{uv}\cdots$, etc. Every such ${\sum}'$ is replaceable by the corresponding $\sum$
together with terms containing more tied suffixes which are of lower order in N since they
involve fewer summations from 1 to N.


12. First consider the even moments with r = 2m. Terms containing more than 3m
different suffixes must vanish, since in such cases it is impossible to avoid at least one $g_{ij}$
with two free suffixes, and $\sum g_{ij} = 0$. For the same reason the only non-vanishing terms with
3m different suffixes are those containing expressions like

$$\sum g_{ij}g_{ik}\,g_{tu}g_{tw}\,g_{pq}g_{pr}\cdots = \bigl(\sum g_{ij}g_{ik}\bigr)^m,$$

and terms with fewer different suffixes are of correspondingly lower order in N.

With 3m suffixes assigned there are $\binom{N-3m}{n-3m}$ ways of selecting the remaining n − 3m
members of the sample, and the suffixes can be tied in $(2m)!\,2^m/m!$ ways to give the same result.
Dividing out by $\binom{N}{n}$ and noting that $\binom{N-3m}{n-3m}\big/\binom{N}{n} \sim n^{3m}/N^{3m}$ when both N and n are large,
the contribution of such terms to $\mu_{2m}$, the 2mth moment of $c^{(n)}$ about its mean, is found to be

$$\frac{n^{3m}}{N^{3m}}\,\frac{(2m)!}{m!}\,2^m\bigl(\sum g_{ij}g_{ik}\bigr)^m,$$

which is of order $n^{3m}$. Moreover, by the same argument, terms with f < 3m different suffixes
add contributions of order $n^f$ which may be neglected. Hence

$$\mu_{2m} \sim \frac{n^{3m}}{N^{3m}}\,\frac{(2m)!}{m!}\,2^m\bigl(\sum g_{ij}g_{ik}\bigr)^m,$$

the neglected terms being relatively $O(n^{-1})$.


13. For the odd moments let r = 2m + 1. Similar considerations show that the non-vanishing
terms of $\sum_n$ cannot have more than 3m + 1 different suffixes, and $\mu_{2m+1}$ is therefore
of order $n^{3m+1}$.

Then since $c^{(n)}/n^{3/2}$ has even moments of unit order and odd moments of order $n^{-1/2}$, the odd
moments may be neglected to that order. We conclude that $c^{(n)}$ is distributed normally for
large n with variance

$$\frac{4n^3}{N^3}\sum g_{ij}g_{ik} = 4n^3\operatorname{var}\tau_i,$$

and t is similarly normal with variance $(4/n)\operatorname{var}\tau_i$.


14. The fact that terms of order $n^{-1/2}$ have to be neglected suggests that the normal
approximation only holds good for fairly large samples. This is not surprising since one would expect
skewness to be an important property of the distribution of t when τ is not zero, if only for
the reason that |t| can never exceed unity. It seems worth while to examine the odd moments
in more detail.

The dominant term of the (2m + 1)th moment has 3m + 1 different suffixes, which can
arise as

$$\sum g_{ij}g_{ik}g_{il}\bigl(\sum g_{uv}g_{uw}\bigr)^{m-1} \quad\text{or}\quad \sum g_{ij}g_{ik}g_{jl}\bigl(\sum g_{uv}g_{uw}\bigr)^{m-1}.$$

Both can be obtained in

$$\frac{(2m+1)!\,2^{m+2}}{3!\,(m-1)!}$$

distinct ways, and there are $\binom{N-3m-1}{n-3m-1}$ ways of selecting the sample with 3m + 1 suffixes
assigned. The (2m + 1)th moment of $c^{(n)}$ about its mean is therefore

$$\mu_{2m+1} \sim \frac{n^{3m+1}}{N^{3m+1}}\,\frac{(2m+1)!\,2^{m+2}}{3!\,(m-1)!}\Bigl(\sum g_{ij}g_{ik}g_{il} + \sum g_{ij}g_{ik}g_{jl}\Bigr)\bigl(\sum g_{ij}g_{ik}\bigr)^{m-1},$$

ignoring terms of relative order $O(n^{-1})$. The corresponding moment of t is obtained to the
same order on dividing by $n^{4m+2}$; it depends only on var t and $\mu_3(t)$, where

$$\mu_3(t) \sim \frac{8}{n^2N^4}\Bigl[\sum g_{ij}g_{ik}g_{il} + \sum g_{ij}g_{ik}g_{jl}\Bigr] = \frac{8}{n^2N^4}\Bigl[\sum g_i^3 + \sum g_{ij}g_ig_j\Bigr],$$

where $g_i = \sum_{j=1}^{N} g_{ij}$. The distribution of t is thus specified to $O(n^{-1})$ by its first three moments.

The moment-generating function of the distribution of t in standard measure is

$$M(\theta) = \bigl(1 + \tfrac16\gamma_1\theta^3\bigr)e^{\theta^2/2} + O(n^{-1}),$$

where

$$\gamma_1 = \mu_3(t)/(\operatorname{var} t)^{3/2} = O(n^{-1/2}),$$

and the frequency distribution of $x = (t - \tau)/\sqrt{\operatorname{var} t}$ is*

$$f(x) = \Bigl(1 - \frac{\gamma_1}{6}\frac{d^3}{dx^3}\Bigr)\frac{1}{\sqrt{2\pi}}e^{-x^2/2} + O(n^{-1}) = \frac{1}{\sqrt{2\pi}}e^{-x^2/2}\Bigl\{1 + \frac{\gamma_1}{6}(x^3 - 3x)\Bigr\} + O(n^{-1}). \tag{14-1}$$


15. The effect of the γ₁ term in modifying the confidence limits based on normal theory
can be seen in the following way. Let ξ be the normal deviate whose chance of being exceeded
is P(ξ). The chance of x exceeding ξ is, from (14-1),

$$F(\xi) = P(\xi) + \frac{\gamma_1}{6}(\xi^2 - 1)\frac{1}{\sqrt{2\pi}}e^{-\xi^2/2}.$$

If X is the correct limit such that F(X) = P(ξ), it is readily proved by successive
approximation that the formula

$$X = \xi + \frac{\gamma_1}{6}(\xi^2 - 1) \tag{15-1}$$

gives the appropriate value of X to $O(n^{-1})$. For example, the 5 and 1 % limits are
respectively ±1.96 + 0.474γ₁ and ±2.58 + 0.941γ₁.
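The adjustment (15-1) is a one-line computation. The sketch below, with an assumed function name, evaluates the adjusted deviate for a given ξ and γ₁ and recovers the coefficient of γ₁ quoted for the 5 % point; the value γ₁ = −0.32 used in the second call is the one reported for assessor A in §17.

```python
def adjusted_deviate(xi, gamma1):
    """X = xi + (gamma1 / 6)(xi^2 - 1), the skewness-corrected limit of (15-1)."""
    return xi + gamma1 * (xi * xi - 1.0) / 6.0


# Coefficient of gamma1 at the two-sided 5 % point, xi = 1.96.
print(round(adjusted_deviate(1.96, 1.0) - 1.96, 3))

# Upper and lower adjusted deviates for gamma1 = -0.32.
print(adjusted_deviate(1.96, -0.32), adjusted_deviate(-1.96, -0.32))
```

Because ξ enters only through ξ², the same shift (γ₁/6)(ξ² − 1) applies to both the upper and the lower deviate, which is why the limits are written ±ξ plus a common term.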


16. In practice the value of var t has to be estimated from the sample, and although its
standard error can be shown to be $O(n^{-3/2})$ by the kind of argument already used, it is not
known how large the sample has to be before the error in estimating the variance can be
safely ignored. It is best to use the unbiased formula

$$\operatorname{var} t = \frac{1}{n(n-1)(n-2)(n-3)}\Bigl\{4\sum c_i^2 - \frac{2(2n-3)}{n(n-1)}c^2 - 2n(n-1)\Bigr\} \tag{16-1}$$

(which is easily proved) in calculating var t from the sample, especially if the standard error
of the mean value of t from a number of small samples is required.
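The unbiased estimate is easily computed from the three sample totals. The sketch below takes (16-1) in the form var t = {4Σc_i² − 2(2n−3)c²/[n(n−1)] − 2n(n−1)} / [n(n−1)(n−2)(n−3)], with c and the c_i computed within the sample; the function name is an assumption. The inputs c = 426, Σc_i² = 7470 and n = 30 are the totals quoted with Table 2 for the MA correlation.

```python
import math


def unbiased_var_t(c, sum_ci_sq, n):
    """Unbiased estimate of var t from the sample totals c and sum of c_i^2."""
    pairs = n * (n - 1)
    inner = 4.0 * sum_ci_sq - 2.0 * (2 * n - 3) * c * c / pairs - 2.0 * pairs
    return inner / (pairs * (n - 2) * (n - 3))


var_a = unbiased_var_t(426, 7470, 30)
print(round(var_a, 6), round(math.sqrt(var_a), 3))
```

These inputs reproduce the estimate 0.006630 and the standard error 0.081 reported for assessor A, which is some reassurance that the formula has been read correctly.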


* Note that the approximation error in f(x) is relatively $O(n^{-1})$, a stronger result than would be
obtained from a Gram-Charlier approximation based on the first three moments only.











As the term in γ₁ is a small correction it is perhaps sufficient in moderate samples to take

$$G = \tfrac12\sum g_{ij}(g_i + g_j)^2 = \tfrac12\sum c_{ij}(c_i + c_j)^2 - \frac{5c}{n}\sum c_i^2 + \frac{3c^3}{n^2}, \tag{16-2}$$

and

$$\mu_3(t) = \frac{8G}{n^6}, \qquad \gamma_1 = \mu_3(t)/(\operatorname{var} t)^{3/2}, \tag{16-3}$$

where the first term in G is the sum of $c_{ij}(c_i + c_j)^2$ over all values of i > j. The unbiased formula
for $\mu_3(t)$ involves some rather tedious computation.


17. To illustrate the methods of the paper we consider an actual example.
A set of thirty wool samples were visually graded in order of fibre fineness by three assessors.
The mean fibre diameter for each wool sample was also determined by direct measurement.
Table 1 shows the measured order (M) compared with that of the three assessors (A, B, C),
in ascending order of experience.

Table 1

      M   A   B   C        M   A   B   C
      1   5   2   1       16  12  14  16
      2   4   5   2       17  10  18  15
      3   9   6   6       18  30  21  25
      4   3   1   3       19  22  26  24
      5   6   7   4       20  16  22  19
      6   2   4   5       21  21  16  18
      7  15  19  10       22  29  20  23
      8  18   3  12       23  28  25  22
      9   8   8   7       24  19  27  26
     10  11   9   8       25  23  28  21
     11  17  13   9       26  20  23  27
     12  13  10  11       27   7  24  20
     13  24  17  17       28  26  29  28
     14  14  12  14       29  27  15  30
     15   1  11  13       30  25  30  29

The method of working will be seen from the $c_{ij}$ matrix for the MA correlation shown
in Table 2.

The correlations of the assessors' orders with the measured order are found to be

    t_A = 0.490,   t_B = 0.724,   t_C = 0.816.

(i) Consider first the maximum confidence limits given by (10-2). The 5 % limits are

    −0.02 < τ_A < 0.80,   0.23 < τ_B < 0.92,   0.34 < τ_C < 0.96.

Again, using the transformation $w = \sin^{-1} t$, the 5 % limits are

    0.01 < τ_A < 0.85,   0.30 < τ_B < 0.97,   0.45 < τ_C < 0.99.

The values of w are w_A = 0.512, w_B = 0.810, w_C = 0.954.
The greatest difference is 0.442, and the upper limit to its standard error is √(4/n) = 0.365,
so on these grounds the difference between A and C would not be judged significant.

The 5 % limits are very wide, and the lack of significance is disappointing since C was
known to be an expert appraiser while A is relatively inexperienced, and one would have
expected an obvious difference between them.







(ii) The variances estimated from the unbiased formula (16-1) are

    var t_A = 0.006630,   var t_B = 0.005067,   var t_C = 0.002198.

The estimated standard errors are therefore

    s_A = 0.081,   s_B = 0.071,   s_C = 0.047.

The 5 % confidence limits, assuming normality, are

    0.33 < τ_A < 0.65,   0.58 < τ_B < 0.86,   0.72 < τ_C < 0.91.


Table 2

                              c_ij                                c_i

0 - + - + - + + + + + + + + - + + + + + + + + + + + + + + +    21
- 0 + - + - + + + + + + + + - + + + + + + + + + + + + + + +    21
+ + 0 - - - + + - + + + + + - + + + + + + + + + + + - + + +    17
- - - 0 + - + + + + + + + + - + + + + + + + + + + + + + + +    19
+ + - + 0 - + + + + + + + + - + + + + + + + + + + + + + + +    23
- - - - - 0 + + + + + + + + - + + + + + + + + + + + + + + +    17
+ + + + + + 0 + - - + - + - - - - + + + + + + + + + - + + +    13
+ + + + + + + 0 - - - - + - - - - + + - + + + + + + - + + +     9
+ + - + + + - - 0 + + + + + - + + + + + + + + + + + - + + +    19
+ + + + + + - - + 0 + + + + - + - + + + + + + + + + - + + +    19
+ + + + + + + - + + 0 - + - - - - + + - + + + + + + - + + +    13
+ + + + + + - - + + - 0 + + - - - + + + + + + + + + - + + +    15
+ + + + + + + + + + + + 0 - - - - + - - - + + - - - - + + +     7
+ + + + + + - - + + - + - 0 - - - + + + + + + + + + - + + +    13
- - - - - - - - - - - - - - 0 + + + + + + + + + + + + + + +     1
+ + + + + + - - + + - - - - + 0 - + + + + + + + + + - + + +    13
+ + + + + + - - + - - - - - + - 0 + + + + + + + + + - + + +    11
+ + + + + + + + + + + + + + + + + 0 - - - - - - - - - - - -     5
+ + + + + + + + + + + + - + + + + - 0 - - + + - + - - + + +    15
+ + + + + + + - + + - + - + + + + - - 0 + + + + + + - + + +    17
+ + + + + + + + + + + + - + + + + - - + 0 + + - + + - + + +    19
+ + + + + + + + + + + + + + + + + - + + + 0 - - - - - - - -    11
+ + + + + + + + + + + + + + + + + - + + + - 0 - - - - - - -    11
+ + + + + + + + + + + + - + + + + - - + - - - 0 + + - + + +    15
+ + + + + + + + + + + + - + + + + - + + + - - + 0 - - + + +    17
+ + + + + + + + + + + + - + + + + - - + + - - + - 0 - + + +    15
+ + - + + + - - - - - - - - + - - - - - - - - - - - 0 + + +   -11
+ + + + + + + + + + + + + + + + + - + + + - - + + + + 0 + -    21
+ + + + + + + + + + + + + + + + + - + + + - - + + + + + 0 -    21
+ + + + + + + + + + + + + + + + + - + + + - - + + + + - - 0    19

c = 426

n = 30

Σc_i² = 7470


Moreover, we should judge A and C to be significantly different at the 1 % level, and A and B
at the 5 % level. How far these conclusions are valid depends, of course, on the accuracy of
the variance estimates, but the conclusions seem to agree with what might have been
expected from prior knowledge of the assessors' capabilities.
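The comparison of two assessors in (ii) amounts to referring the difference of the two coefficients to a standard error built from the estimated variances. The sketch below is an illustration under an explicit assumption: the variance of the difference is taken as the sum of the two variances, as if the two estimates were independent. The function name is hypothetical; the numerical inputs are the values of t and var t quoted above.

```python
import math


def difference_deviate(t1, var1, t2, var2):
    """Normal deviate for t2 - t1, with var(t2 - t1) taken as var1 + var2 (assumption)."""
    return (t2 - t1) / math.sqrt(var1 + var2)


t = {"A": 0.490, "B": 0.724, "C": 0.816}
var = {"A": 0.006630, "B": 0.005067, "C": 0.002198}

z_ac = difference_deviate(t["A"], var["A"], t["C"], var["C"])
z_ab = difference_deviate(t["A"], var["A"], t["B"], var["B"])
z_bc = difference_deviate(t["B"], var["B"], t["C"], var["C"])
print(round(z_ac, 2), round(z_ab, 2), round(z_bc, 2))
```

On this reckoning the A and C deviate exceeds 2.58 and the A and B deviate lies between 1.96 and 2.58, matching the 1 % and 5 % judgements stated in the text, while B and C do not differ significantly.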

(iii) The values of γ₁ calculated from (16-2) and (16-3) are

    γ₁(A) = −0.32,   γ₁(B) = −0.35,   γ₁(C) = −0.38.

The distributions would not appear to be very skew, and the distribution of the difference
of two t's is probably nearly normal. The adjusted 5 % limits are, from (15-1),

    0.32 < τ_A < 0.64,   0.57 < τ_B < 0.85,   0.72 < τ_C < 0.90.










APPENDIX 
1. The question arises whether a particular parental form exists for which the variance
of t assumes the upper limit 2(1 − τ²)/n. We surmise, though we cannot prove, that the
maximum possible variance is attained when the parent ranking has a 'canonical' form obtained
in the following way. Consider again the ranking
ak > Be Be Noe er a ee 


The number of positive pairs S is 26, so that t = 0.44. Let us transform this so as to bring
the 1 to the beginning of the ranking but move the 9 so as to preserve the number S at 26.
The 1 passes over three members to go to the beginning and hence adds 3 to the score. The
9 must, therefore, proceed to the left over three numbers so as to subtract 3 from the score
and we reach Eee ee eae a * 
Now operate similarly with 2 and 9, reaching 

rr ae Be Do ee A ee ine 


Had our 9 been contiguous to the 1 and incapable of moving farther to the left we should 


have moved the 8 and so on. Proceeding with the process by moving back the 3 and the 9 
and 8 we reach [ces Se €- 3.9 4, 


and again                    1 2 3 4 9 8 7 6 5.

All the lower numbers 1 to 4 are in the right order and the remainder are in the inverse order.
We call this ranking the 'canonical' order for given S (or τ). It is not always possible to
reduce a given ranking to canonical order, but there cannot be more than one individual
out of place.


2. Consider the effect of a series of transformations leading to the canonical form. The first
process, that of moving 1 and 9, will increase the value of S for some samples involving 1 but
not 9 (leaving the others unchanged), will decrease the value of S for some samples involving
9 but not 1 (leaving the others unchanged), and will, in general, not alter those involving both
1 and 9. Similarly for 2 and 8, and so on. The effect of the transformation is thus to increase
the values of S containing the lower numbers 1, 2, 3, etc., and to decrease those containing
9, 8, 7, etc. These values of S are themselves, in the canonical form, the greatest or least as
the case may be. Consequently the progress to the canonical form is accompanied by
increases in the number of high values of S and increases in the number of lower values, and
one might expect the spread of the distribution to tend to a maximum. In the example
quoted, the distributions of S in samples of 3 for the successive rankings are:

    Values of S           Frequencies f
         0          2    3    3    6   10
         1         15   13   16   10    -
         2         34   35   29   32   40
         3         33   33   36   36   34
      Totals       84   84   84   84   84

The sums ΣfS are all equal to 182. The sums ΣfS² are respectively 448, 450, 456, 462 and 466,
showing the canonical ranking to have the largest variance of the five.
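The last column of the table can be reproduced by enumeration, since the canonical form is fully described in the text: the ranks 1 to 4 in natural order followed by the remainder in inverse order. The sketch below builds that ranking and tabulates S over all 84 samples of three; the function names are assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def canonical_ranking(N, R):
    """Ranks 1..R in natural order followed by N, N-1, ..., R+1 in inverse order."""
    return list(range(1, R + 1)) + list(range(N, R, -1))


def S_frequencies(parent, n):
    """Frequency of each value of S (positive pairs) over all samples of n."""
    freq = Counter()
    for sample in combinations(parent, n):
        freq[sum(1 for a, b in combinations(sample, 2) if a < b)] += 1
    return freq


parent = canonical_ranking(9, 4)
freq = S_frequencies(parent, 3)
sum_fS = sum(f * S for S, f in freq.items())
sum_fS2 = sum(f * S * S for S, f in freq.items())
print(parent, dict(sorted(freq.items())), sum_fS, sum_fS2)
```

The frequencies come out as 10, 0, 40 and 34 for S = 0, 1, 2, 3, with ΣfS = 182 and ΣfS² = 466, as in the table.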







3. There is, however, another way of carrying out this process. If the parent ranking is
inverted, τ becomes −τ, but the variance of samples of n drawn from the inverted ranking
remains the same, by symmetry. We may then reduce the inverted ranking to its canonical
form and reinvert it so that its coefficient is again τ. This ranking we call the inverse canonical
form. It will be shown that for large N, when τ > 0 the inverse canonical form yields a larger
variance for t than the direct canonical form.

Even in the example already quoted, the inverse canonical ranking (with one member 
out of place) is 2.4.48 42%..4:8-4 
which has a distribution 











    Values of S      f
         0           2
         1          27
         2          10
         3          45
       Total        84

The sum ΣfS² is now 472, which is greater than the previous maximum 466.


4. Consider the canonical case when there are N members altogether, R at the beginning
in the right order, and N − R in the inverse order. If we select n − j members from the R
and j from the N − R the value of S for the sample of n is ½n(n − 1) − ½j(j − 1), and the relative
frequency of U = ½n(n − 1) − S is $\binom{R}{n-j}\binom{N-R}{j}\big/\binom{N}{n}$. Now suppose that N tends to
infinity and R/N to the ratio p. The relative frequency of U = ½j(j − 1) tends in the limit to

$$\binom{n}{j}p^{n-j}q^j,$$

where q = 1 − p. The mean value of U is then

$$\sum_j \tfrac12 j(j-1)\binom{n}{j}p^{n-j}q^j = \tfrac12 n(n-1)q^2,$$

and since $t = 1 - \dfrac{4U}{n(n-1)}$, we must have

$$q = \{\tfrac12(1-\tau)\}^{1/2}. \tag{4-1 A}$$

The variance of U is

$$\operatorname{var} U = n(n-1)\,pq^2\{nq + \tfrac12(1-3q)\}, \tag{4-2 A}$$

and so

$$\operatorname{var} t = 16pq^2\{nq + \tfrac12(1-3q)\}/n(n-1). \tag{4-3 A}$$


5. If now the inverted parent ranking is reduced to canonical form, giving ratios p′, q′
corresponding to p and q, we shall have

$$q' = \sqrt{\tfrac12(1+\tau)} \tag{5-1 A}$$

and

$$\operatorname{var} t' = 16p'q'^2\{nq' + \tfrac12(1-3q')\}/n(n-1). \tag{5-2 A}$$

Then since $q^2 + q'^2 = 1$,

$$\operatorname{var} t' - \operatorname{var} t = \frac{16(n-2)}{n(n-1)}(q'-q)(1-q)(1-q'). \tag{5-3 A}$$

When τ is positive, q′ > q and var t′ exceeds var t.
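The binomial limit makes the moments of U easy to check numerically. The sketch below takes (4-2 A) in the form var U = n(n−1)pq²{nq + ½(1−3q)} and compares it with the variance computed directly from the binomial frequencies of j; it also checks the difference formula (5-3 A) in the form 16(n−2)(q′−q)(1−q)(1−q′)/n(n−1). Function names and the trial values of n, q and τ are assumptions for illustration.

```python
import math
from math import comb


def moments_U_exact(n, q):
    """Mean and variance of U = j(j-1)/2 when j is binomial (n, q)."""
    p = 1.0 - q
    probs = [comb(n, j) * p ** (n - j) * q ** j for j in range(n + 1)]
    u = [j * (j - 1) / 2 for j in range(n + 1)]
    mean = sum(w * x for w, x in zip(probs, u))
    second = sum(w * x * x for w, x in zip(probs, u))
    return mean, second - mean * mean


def var_U_closed(n, q):
    """Closed form assumed for (4-2 A)."""
    return n * (n - 1) * (1.0 - q) * q * q * (n * q + 0.5 * (1.0 - 3.0 * q))


def var_t_canonical(n, q):
    """var t = 16 var U / [n(n-1)]^2, i.e. (4-3 A)."""
    return 16.0 * var_U_closed(n, q) / (n * (n - 1)) ** 2


for n, q in [(2, 0.3), (10, 0.4), (30, 0.7)]:
    mean, var = moments_U_exact(n, q)
    print(n, q, mean, var, var_U_closed(n, q))

# Direct and inverse canonical forms for an illustrative tau = 0.5, n = 30.
tau, n = 0.5, 30
q, q_inv = math.sqrt((1 - tau) / 2), math.sqrt((1 + tau) / 2)
diff = var_t_canonical(n, q_inv) - var_t_canonical(n, q)
print(diff, 16 * (n - 2) * (q_inv - q) * (1 - q) * (1 - q_inv) / (n * (n - 1)))
```

For n = 2 the check can be done by hand: U is 1 with probability q² and 0 otherwise, so var U = q²(1 − q²), which the closed form reproduces.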











This result suggests that the maximum variance may be attained by the inverse canonical
ranking when τ > 0 and by the direct canonical ranking when τ < 0. With this choice of
parent ranking the variance of t for large n is

$$\operatorname{var} t \sim \frac{4\sqrt2}{n}(1+|\tau|)^{3/2}\bigl[1 - \{\tfrac12(1+|\tau|)\}^{1/2}\bigr]. \tag{5-4 A}$$

It is interesting to compare (5-4 A) with our upper limit of 2(1 − τ²)/n. Their ratio is
$\{2(1+|\tau|)\}^{1/2}\big/\bigl[1 + \{\tfrac12(1+|\tau|)\}^{1/2}\bigr]$, which varies from 2(√2 − 1) = 0.83 when τ = 0 to 1 when
τ = 1. Evidently the upper limit to the variance cannot be much improved, since an actual
ranking has been found whose variance approximates to it for all values of τ, when n is not
too small.
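The closing comparison can be tabulated directly. The sketch below evaluates the large-n variance (5-4 A), the upper limit 2(1 − τ²)/n, and the closed form of their ratio; the function names and the grid of τ values are assumptions made for the illustration.

```python
import math


def canonical_large_n_variance(tau, n):
    """(5-4 A): 4*sqrt(2)/n * (1+|tau|)^(3/2) * [1 - sqrt((1+|tau|)/2)]."""
    a = abs(tau)
    return 4.0 * math.sqrt(2.0) / n * (1.0 + a) ** 1.5 * (1.0 - math.sqrt((1.0 + a) / 2.0))


def upper_limit(tau, n):
    """The bound (9-1): 2(1 - tau^2)/n."""
    return 2.0 * (1.0 - tau * tau) / n


def ratio_closed_form(tau):
    """sqrt(2(1+|tau|)) / [1 + sqrt((1+|tau|)/2)]."""
    a = abs(tau)
    return math.sqrt(2.0 * (1.0 + a)) / (1.0 + math.sqrt((1.0 + a) / 2.0))


for tau in (0.0, 0.25, 0.5, 0.75, 0.95):
    direct = canonical_large_n_variance(tau, 100) / upper_limit(tau, 100)
    print(tau, round(direct, 4), round(ratio_closed_form(tau), 4))
```

The ratio rises steadily from about 0.83 at τ = 0 towards 1 as |τ| approaches unity, so the bound is never more than about a fifth too large relative to this attainable variance.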


REFERENCES 


DANIELS, H. E. (1944). The relation between measures of correlation in the universe of sample
permutations. Biometrika, 33, 129.

KENDALL, M. G. (1943). The Advanced Theory of Statistics, 1. 2nd ed. 1945. London: Charles Griffin
and Co.










TESTING FOR NORMALITY 
By R. C. GEARY, Cambridge University Department of Applied Economics 


1. INTRODUCTION


The present communication, one of a series, has two main objectives: 

(1) To show that probabilities derived from the well-known analyses of variance and 
other ‘small sample’ tables, which postulate universal normality, may differ seriously from 
the true probabilities when the universes are non-normal, even, in some cases, when the 
degree of non-normality is not considerable. 

(2) To determine the most efficient tests of normality from a wide field of alternative 
symmetrical tests. 

It may be useful to summarize very briefly previous work in so far as it is strictly relevant
to this study.* The modern theory may be regarded as having been initiated by Karl Pearson
who, in 1895, found the first approximation (i.e. to $n^{-1}$) to the variances and covariance of
$b_1$ and $b_2$ for samples drawn at random from any universe and, assuming that the $\sqrt{b_1}$ and
$b_2$ were distributed jointly with normal probability, constructed 'probability ellipses' from
which the probability of the same values occurring, had the universe, in fact, been normal,
could be inferred very approximately. A considerable advance in moment determination
was made by C. C. Craig (1928). In 1929, R. A. Fisher, in inventing cumulants, simple
functions of the sample moments, and formulating rules for finding their semi-invariants,
developed incidentally a technique for expanding to several terms in 1/n the moments of
$\sqrt{b_1}$ and $b_2$ when the universe was normal. This paper was followed soon after by another
(1930), fundamental for all succeeding work on this subject, in which R. A. Fisher ingeniously
applied combinatorial technique to the finding of exact values of the moments of normal
$\sqrt{b_1}$ and $b_2$, and gave inter alia the values of the second, fourth and sixth moments of $\sqrt{b_1}$
and of the first three moments of $b_2$. The fourth semi-invariant, together with many other
normal semi-invariants of $b_2$, was determined by J. Wishart in 1930, and a further advance
in R. A. Fisher's technique was made jointly by R. A. Fisher & J. Wishart in 1930. In
1932 Joseph Pepper gave the eighth normal moment of $\sqrt{b_1}$. Using R. A. Fisher's rules
C. T. Hsu and D. N. Lawley in 1940 gave the exact values for normal random samples of
the fifth and sixth moments of $b_2$. Using a method due to R. C. Geary (1933) (applying
C. C. Craig's ideas (1928) to the normal problem), R. C. Geary & J. P. G. Worlledge have
recently (1946) found the seventh moment of $b_2$.

So much for moment determination. In 1930, E. S. Pearson used appropriate Pearson-type
curves, applied to R. A. Fisher's (1929) approximations of the semi-invariants, to find
approximate frequency distributions of $\sqrt{b_1}$ and $b_2$. From the frequency distributions he
computed a table of 1 % and 5 % probability points at intervals for n from 50 to 5000 for $\sqrt{b_1}$
and for n from 100 to 5000 for $b_2$.

Since at the time the prospect seemed remote of determining the frequency of normal $b_2$
on which reliance could be reposed for samples of moderate sizes, R. C. Geary (1935)†
suggested that the ratio, a, of mean deviation to standard deviation computed from the origin
might be used as a test of normality, and gave the 1 and 5 % probability points for this test
at intervals for normal samples of 6-100. E. S. Pearson compared experimentally Geary's
test with $b_2$ and suggested, for samples so large that comparison could safely be made, that
$b_2$ was probably somewhat more sensitive than a, a suggestion which will be examined
theoretically in this communication. In 1935 also, R. C. Geary showed that there was a
high (negative) correlation for normal samples between a(1) (see 3-1) and $b_2$,
and argued therefrom that the former should be nearly as efficient as $b_2$. In 1936,
R. C. Geary gave a table of 1, 5 and 10 % probability points of a(1) at intervals for samples
of 11-1001. In 1938, a brochure by R. C. Geary & E. S. Pearson was published by the
Biometrika Office entitled Tests of Normality, giving tables and diagrams of probability
points of a(1), $\sqrt{b_1}$ and $b_2$. There is considerable literature dealing with the effect of universal
non-normality on the normal tests, mostly by way of particular numerical examples: a
selection of papers on this subject is included in the list of references at the end of the paper.

* An excellent account of the development of moment theory up to the year 1930 was given by
J. Wishart (1930).

† The author was informed by M. Fréchet that this test was suggested by Bertrand, but has been
unable to check the reference.


2. EFFECT OF NON-NORMALITY 
(a) The z-test 
The effect of universal non-normality will first be considered in relation to the z-test. 


If $x_1, x_2, \ldots, x_{n'}$ and $y_1, y_2, \ldots, y_{n''}$ are two independent samples drawn at random from the
same universe (normal or non-normal) it is easy to show that, if

$$z = \tfrac12\log_e\frac{\sum(x_i - \bar x)^2/(n'-1)}{\sum(y_i - \bar y)^2/(n''-1)}, \tag{2-1}$$

then

$$\sigma_z^2 = \frac{\beta_2 - 1}{4}\Bigl(\frac{1}{n'} + \frac{1}{n''}\Bigr) = M_2, \tag{2-2}$$

when both n′ and n″ are so large that terms in n′ and n″ of degree less than −1 are regarded
as negligible. This is an obvious generalization of the approximate formula given by
R. A. Fisher* for normal samples, namely,

$$\sigma_z^2 = \tfrac12\Bigl(\frac{1}{n'} + \frac{1}{n''}\Bigr) = M_2^0. \tag{2-3}$$
It may be useful also to give formulae for the first and second moments from zero for z
when the two random samples are drawn not necessarily from the same universes, though
both universes have mean zero and the same variance $\lambda_2$:


» 1 /% aN 1 (Ag, 2Aa3 1A AG Ny A ) 
— ary +7 a aor aa 3 +a (Ga “s)+ 12A,( 74 - 4) 


ie 4 . 1 (A, + 203)? ite 
a(S) + 9M Sa— a) |-aa| oo ha 











4M; = a (+33) + 2ai(= 9 ‘ei)] ad 


—al (or a =) sees) +e) 


- = 33 xr + 203)2— —_ 6 UN, + 208) (AT +208 » | +... 











malar ry (Ag + 2A$)? 


* Statistical Methods for Research Workers, 8th ed. p. 219. 







where the λ's indicate semi-invariants of the two universes of the orders indicated. In these
formulae, in effect, terms to order −2 in n′, n″ are retained.

When both samples are large the frequency distribution of z will approach normality
provided that $\beta_2$ is finite. The effect of universal kurtosis can accordingly be assessed in a
very rudimentary manner from (2-2) and (2-3). The z-deviate ζ corresponding to, say, the
2½ % normal probability point is

$$\zeta = 1.9600\sqrt{M_2^0}. \tag{2-5}$$

If, however, the universe were not normal and had, in fact, a variance $M_2$ with $\beta_2 \ne 3$, the
actual probability of a deviation in excess of ζ in absolute value would be, not 0.05, but the
normal probability appropriate to a unit variance deviate of $\zeta M_2^{-1/2}$. On this consideration
the actual probabilities for different values of $\beta_2$, where the assumed probability is 0.05, are
shown in the fifth column of Table 1.


Table 1. Effect on probability of z of change in universal kurtosis, for large samples

  $\beta_2$   $M_2'/M_2$   $\sqrt{(M_2'/M_2)}$   $1.9600\sqrt{(M_2'/M_2)}$   Actual probability
  1.5         4            2                     3.9200                      0.000089
  2           2            1.4142                2.7718                      0.0056
  2.5         1.3333       1.1547                2.2632                      0.024
  3           1            1                     1.9600                      0.050
  3.5         0.8000       0.8944                1.7530                      0.080
  4           0.6667       0.8165                1.6003                      0.110
  4.5         0.5714       0.7559                1.4816                      0.138
  5           0.5000       0.7071                1.3859                      0.166
  5.5         0.4444       0.6667                1.3065                      0.191
  6           0.4000       0.6325                1.2397                      0.215
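
The columns of Table 1 can be recomputed directly. The following is a minimal sketch, assuming (as (2-2) and (2-3) imply) that $M_2'/M_2 = 2/(\beta_2-1)$ and that the actual probability is the two-sided normal tail beyond $1.9600\sqrt{(M_2'/M_2)}$; the function name `actual_probability` is illustrative only and not part of the paper.

```python
import math

def actual_probability(beta2, nominal_deviate=1.9600):
    """Two-sided normal tail probability when the nominal 5% deviate is
    rescaled by sqrt(M2'/M2) = sqrt(2 / (beta2 - 1))."""
    ratio = 2.0 / (beta2 - 1.0)                    # M2'/M2 from (2-2) and (2-3)
    deviate = nominal_deviate * math.sqrt(ratio)   # fourth column of Table 1
    return math.erfc(deviate / math.sqrt(2.0))     # P(|N(0,1)| > deviate)

if __name__ == "__main__":
    for beta2 in (1.5, 2, 2.5, 3, 3.5, 4, 4.5, 5, 5.5, 6):
        ratio = 2.0 / (beta2 - 1.0)
        print(f"{beta2:4.1f}  {ratio:7.4f}  {math.sqrt(ratio):7.4f}  "
              f"{actual_probability(beta2):.6f}")
```

Only the standard library is used, so the figures can be checked without statistical tables.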























The table shows that, if the universe from which the samples are drawn has $\beta_2 = 6$, the
true probability is about 1 in 5 instead of the assumed 1 in 20. It is, of course, true that
universes with so large a kurtosis are unusual. This view cannot be held of the range 2.5 to 4
for $\beta_2$ in which the probability, assumed to be 0.05, can be anything, in fact, from 0.024
to 0.110. Accordingly, if universal kurtosis is markedly negative, use of the standard table
masks significant differences; if kurtosis is positive the standard table exaggerates these
differences. Unless systematic tests have established that kurtosis is negligible the standard
table should not be used for testing significant differences in variance.

The foregoing analysis gives a theoretical explanation of the striking experimental results
of E. S. Pearson (1931b) working, however, with a test function


_ E a —ay/{ 5 (a,- 24+ 5 wr 
i=1 i=1 i=l] 


and with sample sizes $n' = 5$ and $n'' = 20$, smaller than those contemplated in the present
analysis. With 500 samples Pearson showed that when the frequency at the two tails
together expected from normal theory was 15.4 (= probability 0.0308) the frequencies
actually found in symmetrical universes with $\beta_2 = 2.5$, 4.1 and 7.1 respectively were 7, 39
and 47, equivalent to probabilities of 0.014, 0.078 and 0.094.













If tests of normality indicate universal kurtosis, either of two courses might be adopted:

(i) Assume that z is normally distributed with variance $M_2$ computed from (2-2) with
$(\beta_2-3)$ estimated as $k_4/k_2^2$ from the sample, $k_4$ and $k_2$ being R. A. Fisher's (1929) cumulant
functions.

(ii) Enter the standard table, not with z computed from the samples but with $z\sqrt{(M_2'/M_2)}$,
estimating $M_2$ as in (i).

Both of these procedures are, of course, open to the objection that, unless the samples
are extremely large the estimate of $\beta_2$ is unlikely to be accurate; the real $\beta_2$ might be larger
or smaller than the estimate. Any probabilistic inferences should accordingly be accepted
with reserve.

It is fortunate that the condition specified in the foregoing paragraphs, namely, that the
numbers in the two samples are both large, rarely applies in practical applications. It more
usually happens that the number of classes is small, whereas the number per class is relatively
large. In this case E. S. Pearson (1931b) has shown that the first approximation to $\sigma_z^2$ is independent
of $\beta_2$, from which he inferred that the actual probability when the total number of samples
was large was inconsiderably influenced by kurtosis. In view of the foregoing analysis it
seemed to the writer desirable to carry the inquiry a stage further.

Suppose, then, that k samples are drawn at random from the same universe, $n_j$ in the jth
sample, the total $\sum n_j = n$. It is assumed that n is so large that terms in $n^{-2}$ are negligible,
that the number of samples k is small, and that all the $n_j$ are of the same order of magnitude
as n, i.e. that if

$$n_j = \pi_j n, \qquad \sum_{j=1}^{k}\pi_j = 1, \tag{2-6}$$

none of the $\pi_j$ is negligibly small.


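Using R. A. Fisher's cumulant notation with subscript to indicate the sample from which
the cumulants were computed, the mean for the jth sample is written $k_{1j}$ and its variance
$k_{2j}$. Then

$$z = \tfrac12\log_e\frac{X}{Y}, \tag{2-7}$$

where

$$(k-1)X = \sum_j n_j(k_{1j}-k_1)^2 = \sum_j n_j k_{1j}^2 - \frac1n\Big(\sum_j n_j k_{1j}\Big)^2,$$

so that

$$\frac{k-1}{n}X = \sum_j \pi_j(1-\pi_j)k_{1j}^2 - 2\sum_{j<j'}\pi_j\pi_{j'}k_{1j}k_{1j'},$$

and

$$(n-k)Y = \sum_j (n_j-1)k_{2j},$$

so that $Y = \sum_j \phi_j k_{2j}$, where $\phi_j = (n_j-1)/(n-k)$.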





Without loss of generality let the universal mean be zero and the variance unity. It may
easily be shown that

$$EX = EY = 1.$$

Set

$$w = \frac{X}{Y} = \frac{1+(X-1)}{1+(Y-1)}.$$

Then

$$w = \{1+(X-1)\}\{1-(Y-1)+(Y-1)^2-(Y-1)^3+\ldots\}. \tag{2-8}$$







We shall compute the approximate values of $Ew$ and $Ew^2$, i.e. the values to order $n^{-1}$; the
symbol $\doteq$ denotes 'equal to, to approximation required'. From values of the variances
and covariances given by E. S. Pearson (1931b) in his equations (9)-(11), we have


bpeehee } (2-9) 





E(Y- 1patet? 
n 
with a, = > 75. 
j 
= 
We require (==*)' x+= 5 nf(1—n,)*4y—45 Enfl —m)) my iykay 
j 7 ¢. 


Y—1 = 24,(ka;—1) = 26;kj;, say, 
remembering that, by definition of cumulants, 


Ek,; = A, = 1. 
Also (Y¥—1)? = D¢jk, Ft 2D Did, kaykay- 
It will be useful for what follows to note that 
bj=T}. 


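Using R. A. Fisher's formulae (1929) for formation of joint semi-invariants of $k_1$ and $k_2$,
and noting that the k samples are independent, we find from the foregoing

$$\begin{aligned}
n(k-1)\,EX(Y-1)^2 &\doteq (k-1)(\lambda_4+2),\\
n(k-1)^2\,EX^2(Y-1) &\doteq 2(k^2-1)\lambda_4,\\
n(k-1)^2\,EX^2(Y-1)^2 &\doteq (k^2-1)(\lambda_4+2).
\end{aligned} \tag{2-10}$$

Then, from (2-8), (2-9), (2-10),

$$\begin{aligned}
Ew &\doteq 1+\frac{2}{n},\\
Ew^2 &\doteq \frac{k+1}{k-1}+\frac{1}{n(k-1)^2}\{6(k^2-1)-(k^2+2k-2-a_{-1})\lambda_4\}.
\end{aligned} \tag{2-11}$$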


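These are the formulae required. It will be noted

(i) that the terms free of $n^{-1}$ are independent of $\lambda_4$, which is equivalent to E. S. Pearson's
result (1931b);

(ii) that the formulae (2-11) agree with the normal values

$$Ew = \Big(1-\frac{2}{n-k}\Big)^{-1} \doteq 1+\frac{2}{n}, \qquad
Ew^2 = \frac{k+1}{k-1}\Big(1-\frac{2}{n-k}\Big)^{-1}\Big(1-\frac{4}{n-k}\Big)^{-1} \doteq \frac{k+1}{k-1}\Big(1+\frac{6}{n}\Big), \tag{2-12}$$

to $n^{-1}$ when $\lambda_4 = 0$;

(iii) that the approximations at (2-11) are free of $\lambda_3$.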
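
Remark (ii) can be checked numerically. The sketch below assumes the reading of (2-11) as $Ew \doteq 1+2/n$ and $Ew^2 \doteq (k+1)/(k-1) + \{6(k^2-1)-(k^2+2k-2-a_{-1})\lambda_4\}/\{n(k-1)^2\}$, with $a_{-1} = \sum_j \pi_j^{-1}$, and compares it with the exact moments of the variance ratio with $k-1$ and $n-k$ degrees of freedom. The function names, and the default of equal sample sizes ($a_{-1} = k^2$), are assumptions made for illustration.

```python
def approx_moments(n, k, lam4=0.0, a_minus1=None):
    """First two moments of w to order 1/n, following (2-11)."""
    if a_minus1 is None:
        a_minus1 = k * k      # equal samples: pi_j = 1/k, so sum of 1/pi_j is k^2
    ew = 1 + 2 / n
    ew2 = (k + 1) / (k - 1) + (
        6 * (k * k - 1) - (k * k + 2 * k - 2 - a_minus1) * lam4
    ) / (n * (k - 1) ** 2)
    return ew, ew2

def normal_moments(n, k):
    """Exact moments of the variance ratio with (k-1, n-k) degrees of freedom."""
    d = n - k
    ew = d / (d - 2)
    ew2 = (k + 1) / (k - 1) * d * d / ((d - 2) * (d - 4))
    return ew, ew2

if __name__ == "__main__":
    for n in (50, 200, 1000):
        print(n, approx_moments(n, 4), normal_moments(n, 4))
```

The gap between the two pairs shrinks like $n^{-2}$, as expected of an approximation correct to $n^{-1}$.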










The approximations at (2-11) tend to confirm E. S. Pearson's result that, when n is large
compared with k, the effect of universal kurtosis is unimportant. It would be useful, however,
to compute the approximate true probability for different values of $k$, $n$, $\lambda_4$ and $a_{-1}$. For this
and for subsequent work the following lemma* will be found useful:

If $f(x)$ and $\phi(x)$ are two frequency densities with semi-invariants $L_m$ and $L'_m$ $(m = 1, 2, \ldots)$,
respectively, then, formally,

$$f(x) = \exp\Big\{\sum_{m=1}^{\infty}\frac{(L_m-L'_m)}{m!}\Big(-\frac{d}{dx}\Big)^m\Big\}\,\phi(x). \tag{2-13}$$

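For the present application take as generating function $\phi$ the frequency distribution of w
in the normal case, i.e.

$$\phi(w) = \frac{\Gamma\{\tfrac12(n-1)\}}{\Gamma\{\tfrac12(k-1)\}\,\Gamma\{\tfrac12(n-k)\}}
\Big(\frac{k-1}{n-k}\Big)^{\frac12(k-1)} w^{\frac12(k-3)}
\Big(1+\frac{(k-1)w}{n-k}\Big)^{-\frac12(n-1)}, \tag{2-14}$$

and, from (2-11),

$$L_2-L_2' \doteq -\frac{(k^2+2k-2-a_{-1})\lambda_4}{n(k-1)^2}. \tag{2-15}$$

Assume that $L_m-L'_m = 0$ $(m \neq 2)$.

Then if the 'normal theory' probability corresponding to the sample value w be p, the
approximate 'true' probability, subject to (2-15), will be about $(p+p')$, where $p'$ is given by

$$p' = \frac{L_2-L_2'}{2}\int_w^{\infty}\phi''(w)\,dw = -\frac{L_2-L_2'}{2}\,\phi'(w). \tag{2-16}$$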


The term $p'$, of course, merely corrects for the non-normal term in $n^{-1}$ in the variance of z;
it takes no account of corrections due to terms of higher (negative) orders in n or even of
non-normal terms in $n^{-1}$ in semi-invariants $L_m$ $(m > 2)$. The calculation is designed merely
to show whether the standard table probability requires correction for universal kurtosis;
this will appear if $p'$ is of the order of magnitude of p.


(b) The t-test 
In Geary's 1936 paper the expansions to terms in $n^{-2}$ of the first four moments of t, where

$$t = n^{1/2}k_1/k_2^{1/2}, \tag{2-17}$$

were given. Following are the first six semi-invariants L of t to the same approximation as
in the earlier paper:


1 
kun -a {3 +o (22g— 205+ BAsAy) + 
L,=1 + }(8+ 7A3) n-1 + (6 — 2A, — $A3 — 48A,A, + 432ARA,) n-, 
L,= — 2A,n-* — (9A, — 3A, + 4BASA, + 8323) nt, 
tL, (6— 2A, + 1202) n-) + (54 — 18A, + 4A, + 7503 — 63A,A, — GAZ + SIAZA, + 28228) n-2, 
L,= —(60A,— 6A, — 20A,A, + 105A3) n-#, 
L,= (240 — 120A, +. 5774A2 + 16A, — 210A,A, — 150A, + 1200A8) n-2. 
Throughout this subsection we take $\lambda_m = \lambda'_m/\lambda_2'^{\,m/2}$,


} (2-18) 





* Due to Charlier and termed the “‘ Differential Series” by the Scandinavian School. 
+ 1936 formula corrected. 










where the $\lambda'_m$ are the semi-invariants of the parent universe. For these expressions terms in
$n^{-5/2}$ are neglected. They were derived from the moments (from zero) $M'_i$ of t, which were
obtained by the method described in the 1936 paper. It will be noted that, to the approximation
used, the expressions involve only the first six semi-invariants of the parent universe.
When the parent universe is normal all the $\lambda_i$ $(i > 2)$ are zero. The magnitude of the numerical
coefficients in the foregoing approximate expressions for the $L_i$ indicates that, when the
universal values of the $\lambda_i$, particularly those of uneven order, are not very small, the frequency
distribution of t may differ appreciably from the classical Gosset-Fisher (1908, 1925) distribution.

The formal Gram-Charlier expression for the frequency of t could, of course, be written
down at once from (2-18). It is doubtful, however, if the Gaussian can be regarded as the
most appropriate generating function for the frequency of t because, even when the parent
universe is normal, the semi-invariants $L'_{2m}$ of the higher even orders are large for moderate
values of n. For example,

$$L_4'/L_2'^{\,2} = 6/(n-5); \qquad L_6'/L_2'^{\,3} = 240/\{(n-5)(n-7)\}.$$
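
The two ratios quoted above can be verified exactly from the even central moments of Student's distribution with $n-1$ degrees of freedom. The sketch below uses rational arithmetic; the function names are illustrative, and the moment formulae are the standard ones for Student's t rather than anything taken from the paper.

```python
from fractions import Fraction

def student_even_moments(nu):
    """Central moments mu2, mu4, mu6 of Student's t with nu > 6 degrees of freedom."""
    nu = Fraction(nu)
    mu2 = nu / (nu - 2)
    mu4 = 3 * nu**2 / ((nu - 2) * (nu - 4))
    mu6 = 15 * nu**3 / ((nu - 2) * (nu - 4) * (nu - 6))
    return mu2, mu4, mu6

def cumulant_ratios(n):
    """Return (L4'/L2'^2, L6'/L2'^3) for samples of n, i.e. n - 1 degrees of freedom."""
    mu2, mu4, mu6 = student_even_moments(n - 1)
    k4 = mu4 - 3 * mu2**2                        # fourth semi-invariant (symmetric case)
    k6 = mu6 - 15 * mu4 * mu2 + 30 * mu2**3      # sixth semi-invariant (symmetric case)
    return k4 / mu2**2, k6 / mu2**3

if __name__ == "__main__":
    for n in (10, 20, 50):
        r4, r6 = cumulant_ratios(n)
        print(n, r4, Fraction(6, n - 5), r6, Fraction(240, (n - 5) * (n - 7)))
```

For $n = 10$ the ratios are $6/5$ and $16$, which shows how far the higher semi-invariants are from their Gaussian value of zero.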
It is proposed to use (2-13) for finding the approximate frequency with 


wy = Tem) = ("S)r(14+ )"/C Seay, (2-19) 


n—1 





the Gosset-Fisher frequency. Let 
i? 


T(t; n) = (: ~ 4)" (2-20) 


It can easily be shown that the rth derivative (in ¢) of 7, is 





TYXt; n) = (-y ee r- OD pn CIE 2) = 9) pa 





(n—1)!(n—-1 2.4 
r(r — 1) (x — 2) (r—3) (r—4) (r—5) {2 \—H+2n 
with n, = a-] ee (n—1)8 





“antl? "> (@+l(nts)’ ™s = (n+) (n+38)(n+5)’ = 


Note that (2-21) assumes the Hermite form when $n = \infty$.

The theory will now be applied to particular examples using in all cases n = 10. The
universes will be assumed to belong to the Karl Pearson system, so that (M. G. Kendall,
1941) the values of $\lambda_5$ and $\lambda_6$ can be derived (given $\lambda_3$ and $\lambda_4$) from the following equations:


(1+ 79) Ag + 5EA, + 109(4A, + 3A2) = 0. 


From the first two equations

$$\eta = (2\lambda_4-3\lambda_3^2)/(-10\lambda_4+12\lambda_3^2-12),$$

which, substituted in the first equation of (2-22), gives $\xi$. The values of $\xi$ and $\eta$, substituted
in the third and fourth equations, give $\lambda_5$ and $\lambda_6$. From (2-18), the $L'_i$ being the semi-invariants



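when the parent universe is normal (i.e. the values found when all the $\lambda$'s are set equal to
zero),

$$\begin{aligned}
L_1-L_1' &= J_1n^{-1/2}+K_1n^{-3/2}, & L_4-L_4' &= J_4n^{-1}+K_4n^{-2},\\
L_2-L_2' &= J_2n^{-1}+K_2n^{-2}, & L_5-L_5' &= K_5n^{-3/2},\\
L_3-L_3' &= J_3n^{-1/2}+K_3n^{-3/2}, & L_6-L_6' &= K_6n^{-2}.
\end{aligned} \tag{2-23}$$

The J and K are the terms in the $\lambda$ in (2-18). To $n^{-2}$ (i.e. ignoring $n^{-5/2}$) the frequency generated
from T of (2-19) is as follows: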


sh ach ed Me hae ” (e+ 4h) +2, JB ni 


(K,+ 5J,J,+ 10J,J,+ 102 J,) 


+n i. D+ (Ky+ Bd, ++ 


he + 23, J3)+ a Ji casa K,)+> g (Kat 4h Ks 


iG 1296 


+43, K,+3J3+ 6J2J,+ Jf 42 (K,+6J,K,+20J,K,+ 15J,J, 


720 


+ 600), Jy Jy + 15I3J, + 20J3Jg) + Tam (Ob K s+ 10.3 + 80J,J2 


+ 80J,JyJ, + 8053I3) + D— (BURJ, + 4, J3) + J 


12 e 
5184 31,104? J = 


h 
with p=(-5) 7. 


To $n^{-1}$, (2-24) agrees with the formula given by M. S. Bartlett (1935), in which, however,
there is a small and obvious slip in a sign. The law of formation of the numerical coefficients
of (2-24) is evident; for instance, the numerical coefficient of $D^8J_2J_3^2$ is $1/144 = 1/\{2!\,(3!)^2\,2!\}$.

The integrals $\int_t^{\infty}$ and $\int_{-\infty}^{-t}$ $(t > 0)$ are found by reducing the exponent of D by unity, as follows:


-t Cs) at ro} -t 
| Dd = -T, ( D™ dt = | D™@ dt = D*', | Dm dt = -| D1 = D™. 
-@ e -o t -@ 
(2-25) 


In normal theory the upper and lower 2½ % points of t are ±2.262 for n = 10. Table 2
shows the 'true' probabilities, i.e. the value of

$$\int_{-\infty}^{-2.262} f(t)\,dt \tag{2-26}$$

for parent universes specified by $\lambda_3$, $\lambda_4$, using (2-24).
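
As a check on the normal-theory starting point, the lower tail of the Gosset-Fisher frequency beyond $-2.262$ for samples of 10 can be integrated numerically. This is a minimal sketch using Simpson's rule and the standard Student density with $n-1$ degrees of freedom; the function names and the number of integration steps are illustrative choices, and no attempt is made to evaluate the correction terms of (2-24).

```python
import math

def student_density(t, n):
    """Gosset-Fisher (Student) density of t for a sample of n, i.e. n - 1 d.f."""
    nu = n - 1
    const = math.gamma(n / 2) / (math.sqrt(math.pi * nu) * math.gamma(nu / 2))
    return const * (1 + t * t / nu) ** (-n / 2)

def lower_tail(t0, n, steps=2000):
    """P(t < -t0) for t0 > 0: Simpson's rule on [0, t0], then symmetry."""
    h = t0 / steps                       # steps must be even for Simpson's rule
    total = student_density(0.0, n) + student_density(t0, n)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * student_density(i * h, n)
    return 0.5 - total * h / 3

if __name__ == "__main__":
    print(round(lower_tail(2.262, 10), 4))
```

The result agrees with the entry 0.025 for the normal universe in Table 2.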

There are two observations to be made on the results presented in this table. The first is
that, despite the considerable number of terms (shown at (2-24)) included in the probability
expansion, the values found in the successive terms cannot be regarded as satisfactorily
convergent for so small a sample as 10, and, of course, the convergence disimproves with
increasing $\sqrt{\beta_1}$. Taken all together, however, they seem consistent and significant. The second
observation is that attention was confined to the negative 'tail' of the distribution. It may
be assumed that, in all cases, the distortion would be very considerably less marked if
regard were had to the probability for $|t| > 2.262$. Actually for universe 3 the probability










is 0.056, not significantly different from the normal theory probability of 0.05. In justification
of the attitude adopted above, the point might be put as follows:
We decide to accept the hypothesis that the universal mean is zero provided that the value
of t found from the particular sample satisfies $t_1 < t < t_2$, where

$$\mathrm{Prob}\,(t < t_1) = \mathrm{Prob}\,(t > t_2) = 0.025.$$

The table is designed to show that if the parent universe is markedly asymmetrical the
range $(t_1, t_2)$ may differ appreciably from $-t_1 = t_2 = 2.262$.


Table 2. Probabilities of t less than −2.262 for samples of 10 for seven universes

  Universe   $\lambda_3 = \sqrt{\beta_1}$   $\lambda_4 = \beta_2-3$   Probability
  Normal     0                              0                         0.025
  2          0                              1                         0.024
  3          1/2                            0                         0.041
  4          $1/\sqrt2$                     1/2                       0.047
  5          1                              0                         0.072?
  6          1                              1                         0.086?
  7          1/2                            1/2                       0.043




















As anticipated by earlier work (W. S. Gosset, 1908; R. C. Geary, 1936), the table shows
that the distortion is slight for symmetrical universes; even when $\lambda_4 = 1$ (and $\lambda_3 = 0$) the
probability (0.024) is practically identical with the normal value. There can be little doubt
that the standard table probabilities can be seriously at variance with the true probabilities
when the universes from which the samples are drawn are markedly asymmetrical.


(c) Difference of means 
R. A. Fisher's (1925) test of significance

$$t = \frac{(k_1'-k_1'')\sqrt{\{(n'+n''-2)\,n'n''\}}}{\sqrt{\{[(n'-1)k_2'+(n''-1)k_2''](n'+n'')\}}} \tag{2-27}$$

for the difference of averages $k_1'$ and $k_1''$ in normal theory for random samples numbering $n'$
and $n''$ is, of course, a particular case of the analysis of variance considered in § (a) above.
The second cumulants are $k_2'$ and $k_2''$. It is assumed that the unknown universal means and
variances are equal. Suppose now that the random samples in reality have been derived
from universes in which the means are equal but the other semi-invariants $\lambda_i'$ and $\lambda_i''$ are not
necessarily zero for $i > 2$, or even necessarily equal. Since the universal means are assumed
equal, without loss of generality we may take $\lambda_1' = \lambda_1'' = 0$. This general mathematical model
seems to be the correct one; we are not trying to determine the probability of the samples
being derived from the same universe but rather if they could conceivably have been drawn
from universes with the same arithmetic mean, however much they may differ otherwise.
The correctness or otherwise of the concept may be considered in relation to, say, the
problem of deciding from two random samples which of two types of fertilizer is to be preferred
from yield observations on a given crop on a given kind of land. Undoubtedly the
prime problem will be that of ascertaining which is probably the better yielding (i.e. whether
the arithmetic means are significantly different). Of considerably less importance is the




question of which fertilizer is the more variable; of less importance still is the question of
deciding, say, whether with approximately equal yields one universe is symmetrical and
the other markedly asymmetrical. The point is that the question of the equality of universal
means should be considered without assuming that the other semi-invariants in the universes
from which the samples have been drawn are necessarily equal. This essentially is also the
viewpoint in R. A. Fisher's randomization method.

Expanding the denominator of (2-27) in terms of $(k_2'-\lambda_2')$ and $(k_2''-\lambda_2'')$ and computing
therefrom the first few terms of the first four moments of t, we find the following approximations
to the first four semi-invariants:


_. ___(As—As) 
aiy= 2(n’A,+n"A3)’ 
ae Qn’ Ai? + n"Az? 
2 o2 beific Mitac | 
atlanta) (14 aacare ) 


(n’2—n") (AQAZ—AGAZ) | _7(A5—A5)? 
nn'(n'rA,+n'Aa)® | (nA, +n"Ag)®’ 


MNS MARAE) [Ak 2) 
OF. ey ab ee en 3 q 2-28 
Mh, n'? n"2 (n'A,+n"Aj) (3 ro Oey 





n 

Aap, = SABENA) (* a) -o- a) (Aj—A3) 
«= Age nAg)® \n' tn 7a n"8) (n’A, +n" Ag) 

19(A;—A3)? (% *) a 

(WAT RAR? \n’ 0”) WB ns 


-3(234 *3) {Ag(n'n" Ay + 2n”? — 0) +Ag(n'n"AZ + Qn’? — "2A; a)} 




















non n'n"(n'A,+n"Ag)* 4 
’ ” ’ ee ” fil + 
with yee ea\ ele 
n'n (n’ +n" — 2) 


Using formula (2-24) to the term in $n^{-1}$ with the Gosset-Fisher function again as generating
function, Table 3 shows rough approximations, for four examples, to the 'true' probability
of values of $t < \tau$, where $\tau$ is the (negative) value for probability 0.025 from the normal table,
and $\lambda_2' = \lambda_2'' = 1$. When the two samples are drawn from different universes the distortion
can accordingly be considerable. The third example suggests that if the universes are the
same the distortion is small, a result to be anticipated from the fact (apparent from (2-28))
that, to the approximation used, the first two semi-invariants are equal to their normal
theory values; this theory confirms the experimental results of E. S. Pearson & N. K. Adyanthaya
(1929).








Table 3

  Example   $n'$   $n''$   $\lambda_3'$   $\lambda_3''$   $\lambda_4'$   $\lambda_4''$   Probability
  1         12     4       1              −1              1              −1              0.045
  2         18     6       1              −1              1              −1              0.041
  3         7      4       $1/\sqrt2$     $1/\sqrt2$      1/2            1/2             0.027
  4         10     6       1              0               1              0               0.036





































It should be remarked that the probabilities in Table 3 (as well as in Table 2) are merely
rough approximations; the samples used are far too small for the results to have any pretension
to accuracy. The object has been merely to show that the actual probability could be
considerably at variance with that shown in the standard table, for small samples.


3. SUFFICIENT CONDITIONS FOR APPROACH TO NORMALITY OF a(c) WITH INCREASING n 


The remainder of the paper deals with the field of symmetrical tests of normality, homogeneous
of degree zero, represented by (3-1). It is essential to establish the conditions of
approach to normality of the frequency distribution of a(c) as the sample number increases.

Let

$$a(c) = \frac1n\sum_{i=1}^{n}|x_i-\bar x|^c\Big/\Big\{\frac1n\sum_{i=1}^{n}(x_i-\bar x)^2\Big\}^{c/2}, \tag{3-1}$$

where $\bar x = \sum x_i/n$ and c is non-negative. It will be shown in succession that, subject to stated
conditions, with increasing n,

(i) the frequency distribution of

$$a_1(c) = \frac1n\sum_{i=1}^{n}|x_i|^c\Big/\Big\{\frac1n\sum_{i=1}^{n}x_i^2\Big\}^{c/2} \tag{3-2}$$

tends towards normality, and

(ii) the frequency distribution of a(c) tends towards that of $a_1(c)$ and hence towards
normality.
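
The two statistics are easily computed from a sample. The following is a minimal sketch under the reading of (3-1) and (3-2) given above, with deviations taken from the sample mean for a(c) and from the (assumed zero) universal mean for $a_1(c)$; the function name `a_stat` and its `centre` flag are illustrative, not the paper's notation.

```python
def a_stat(xs, c, centre=True):
    """a(c) of (3-1) when centre=True; a1(c) of (3-2) when centre=False."""
    n = len(xs)
    mean = sum(xs) / n if centre else 0.0
    num = sum(abs(x - mean) ** c for x in xs) / n            # mean absolute c-th power
    den = (sum((x - mean) ** 2 for x in xs) / n) ** (c / 2)  # second moment to power c/2
    return num / den

if __name__ == "__main__":
    sample = [1.0, 2.0, 4.0, 7.0]
    print(a_stat(sample, 1), a_stat(sample, 4))               # a(1) and a(4) = b2
    print(a_stat([10 * x for x in sample], 2.4) - a_stat(sample, 2.4))  # scale-free
```

Being homogeneous of degree zero, the statistic is unchanged when every observation is multiplied by a constant, and a(4) coincides with $b_2 = m_4/m_2^2$.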


It is assumed, without loss of generality, that the universal mean of the universe from
which the sample of n is drawn is zero. Denote the kth absolute moment from zero by $\mu_k$,
k not being necessarily an integer. Given a positive quantity $\epsilon$ arbitrarily small, $\omega(\epsilon)$ can be
found so that

$$\mathrm{Prob}\Big\{\Big|\frac1n\sum(|x_i|^c-\mu_c)\Big| < \omega\sqrt{\frac{\mu_{2c}-\mu_c^2}{n}}\Big\} > 1-\epsilon, \tag{3-3}$$

$$\mathrm{Prob}\Big\{\Big|\frac1n\sum(x_i^2-\mu_2)\Big| < \omega\sqrt{\frac{\mu_4-\mu_2^2}{n}}\Big\} > 1-\epsilon, \tag{3-4}$$

provided, of course, that $\mu_{2c}$ and $\mu_4$ exist. As n increases $\omega$ may be envisaged as approaching
the normal probability point appropriate to the probability $\epsilon$, since, in the conditions stated,
$\sum|x_i|^c/n$ and $\sum x_i^2/n$ are normally distributed in the limit. For samples which satisfy the
inequality in the brackets { } at (3-4) and if n is so large that

$$\omega\sqrt{\frac{\mu_4-\mu_2^2}{n}} < \mu_2,$$

the denominator of (3-2) can be expanded to three terms (including the remainder) by
Taylor's theorem, so that $a_1(c)$ may be written





Prob {| Z(t) 





+ + a 





l 
(Cc) = Aq, a*{1 *. E(u -5 «) _ 532M 22, + (22 on += 2m) x} » (3-5) 


with y; = (| 2; °= Het) /Mrer> 
% = (]—H2)/Ha 


6 —Hc+4) 
X = wy? {ua +5 Ete ny) (0<0@<1). 








With probability exceeding (1—e) it is evident, from (3-4), that X is maximized by 
0-48) 
It will suffice, for the present purpose, to infer that

$$|X| < \kappa,$$

where $\kappa$ is a constant independent of n. We have now
1 c 
ke 2(u-$4) = 0. 


na ¥; 2% 





1 [jae Chiesa) , © Ma (5 )| 
= — {Fas _ Piers » “Pe (Fa) }, (3-6) 
re Mel, 44g \2 
1 (ud a,(c) 1 ( c ) 
and - (1 ——Zly;—=2,} =u, (3-7) 
o\ ei na \8 3% 
with meper oy Bay SD (Bayt (145 By) X (3-8) 
Onda Yi grag 7 oS 


For samples which satisfy the inequalities in { } at (3-3) and (3-4) and hence with a pro- 
bability exceeding (1 —2e), we have 


Co? M(Miaoilta) , (C+ 2) KW%My (© [pia) — § 
inde a (+e [<j 


(3-9) 
20 Npbiei fle 8a MR fan ” 
where is independent of n. Or, briefly, 
Prob {| «| <)> 1-2 (3-10) 


so that u tends in probability towards zero with 1/n. Now (3-7) may be written in the form 
u = Y’—Y, where Y’ and Y are the respective terms on the left side. If A be any number 
and F the total probability function, a well-known lemma (Fréchet, 1937, p. 164) shows that 


|Fr(A)—Fy(A)|<{Fr(4+4)—Fy (4-4) +26 (3-11) 


using (3-10). Hence the frequency distribution of 


y’ =! (ext a(C) _ 1) (3-12) 

o\ ei 
tends towards that of Y=  : py (u.-§ x) (3-13) 
no 2 


at every continuity point of the latter frequency, as n tends towards infinity. But Y, from
(3-13), is the simple average of n random measures, and its frequency must tend towards
normality provided that its standard deviation exists; from (3-6) it is evident that $\sigma$ is
finite provided that $\mu_k$, where k is the greater of 2c and 4, is finite. Here and in the remainder
of this section it will be useful to remember that if $\mu_k$ exists so does $\mu_{k'}$ for $0 < k' < k$.













To prove that the frequency distribution of a(c) tends towards that of $a_1(c)$ and hence
towards normality with increasing n it will be shown that $\sum|x_i-\bar x|^c/n$ tends in probability
towards $\sum|x_i|^c/n$. Two cases will be considered separately: (1) $c \ge 1$, (2) $1 > c > 0$.


Case (1). $c \ge 1$
For values of $x_i$ for which $|x_i| > |\bar x|$,

$$|x_i-\bar x|^c-|x_i|^c = \pm c\,\bar x\,|x_i-\theta\bar x|^{c-1} \quad (0 < \theta < 1),$$

and when $|x_i| \le |\bar x|$,

$$\big|\,|x_i-\bar x|^c-|x_i|^c\,\big| \le (2^c+1)\,|\bar x|^c.$$
| n n 
Hence ;| S (|x,—#|)°—(| a |) <l8l(5 = jz[2+0]2/-), (3:14) 
N\i=1 M j=1 





B and C being independent of the z; and n but depending on c. With ¢ arbitrarily small 
w can be found so that 


Zi<w [\351— 
Prob {|| w [A\>1 €, exis 
Prob F gf (| #5 |°? — yen) 





<w, [Mines = Hea — 
n 
Hence, from (3-14) and (3-15), if w, and jj,_») exist, 


Prob {| 2|x,—2|° ~=|2,|¢| < wiht 2e 





for n sufficiently large, the constant $B'$ depending on c but not on n. Hence for $c \ge 1$,
$\sum|x_i-\bar x|^c/n$ tends in probability towards $\sum|x_i|^c/n$. Incidentally, this proves that
$\{\sum(x_i-\bar x)^2/n\}^{c/2}$ tends in probability towards $\{\sum x_i^2/n\}^{c/2}$, the latter two expressions representing
respectively the denominators of a(c) and $a_1(c)$.
Case (2). $1 > c > 0$

Let $\bar x$ satisfy a probabilistic inequality identical in form with the first equation of (3-15)
and let $\gamma$ be any positive quantity, fixed once for all. Let n (presently to be defined further)


be so large that 
e y>w, [*. 
1” 1 A Sy 
Then E B (a - 2° 12619 = 5 + ) ae-2 lL (3-16) 
N j=1 N\iuley lal<y 


When | x; | > y.(i-e. in 2”), 


|a,-%|°—|a,|° = +c%|a,-0z|*" (0<0<1), 
He) |v 
so that Prob | | |x, —% |°—| a; |*| <eu(y—w, /4) Nite l—e. (3-17) 
When | x; | <+ (i.e. in 2"), given 7 arbitrarily small and positive, x can be found so that 
|| a; —%|°—| a, |*| <9, (3-18) 


when |%|<w o, 


since | x |° (c> 0) is uniformly continuous in 2”. We then have 
Prob {|| z;—% |°—| 2, |°|<y}>1l—e. (3-19) 








Combining (3-17) and (3-19), it may be inferred that

$$\mathrm{Prob}\Big\{\Big|\frac1n\sum|x_i-\bar x|^c-\frac1n\sum|x_i|^c\Big| < c\,\omega\Big(\gamma-\omega\sqrt{\frac{\mu_2}{n}}\Big)^{c-1}\sqrt{\frac{\mu_2}{n}}+\eta\Big\} > 1-2\epsilon, \tag{3-20}$$

the first term of the upper limit in { } tending to zero as n tends towards infinity, and $\epsilon$ and $\eta$
being arbitrarily small.

We have accordingly shown that the numerator and denominator of a(c) tend in probability
towards those of $a_1(c)$. Hence a(c) tends in probability towards $a_1(c)$. Hence, using
the lemma cited at (3-11), the total frequency of a(c) tends towards that of $a_1(c)$ which tends
towards normality as n tends towards infinity. Finally:

If c > 0 the frequency distribution of a(c), given by (3-1), tends towards normality as n tends
towards infinity provided that $\mu_k$, where k is the greater of 2c and 4, is finite.

It seems likely that an analogous theorem can be proved for $0 > c > -\tfrac12$; we shall not, however,
be concerned in this communication with negative values of c.





4. MOMENTS OF a(c) FOR NORMAL SAMPLES 


While it will be shown in later sections that, with indefinitely large samples, $\sqrt{b_1}$ and $b_2$ are
the most efficient tests of asymmetry and kurtosis, respectively, it by no means follows that
other tests are inefficient or that they may not be useful supplements in cases in which the
prime tests are indecisive as to the probable non-normality of a given sample. It is accordingly
proposed to give here close approximations to the first four moments (from the origin)
of a(c) (given by (3-1)) for normal random samples of n.

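For normal samples (R. A. Fisher, 1929; R. C. Geary, 1933)

$$M'_k\{a(c)\} = E\{a(c)\}^k = E\Big\{\frac1n\sum_{i=1}^{n}|x_i-\bar x|^c\Big\}^k\Big/E\Big\{\frac1n\sum_{i=1}^{n}(x_i-\bar x)^2\Big\}^{ck/2}. \tag{4-1}$$

The exact value of the denominator is, of course, known, for

$$E\Big\{\frac1n\sum(x_i-\bar x)^2\Big\}^{k'/2} = \Big(\frac{n-1}{n}\Big)^{k'/2}Es^{k'} = \Big(\frac{n-1}{n}\Big)^{k'/2}\Big(\frac{2}{n-1}\Big)^{k'/2}\Big(\frac{n+k'-3}{2}\Big)!\Big/\Big(\frac{n-3}{2}\Big)!, \tag{4-2}$$

since, as usual, $(n-1)s^2 = \sum(x_i-\bar x)^2$. It will be useful to expand $\log_e Es^{k'}$ with $k' = ck$ using
Stirling's formula in (4-2):

$$\begin{aligned}
\log_e Es^{k'} &= \tfrac12 k'\log\frac{2}{n-1}+\log\Big(\frac{n+k'-3}{2}\Big)!-\log\Big(\frac{n-3}{2}\Big)!\\
&= \frac{k'^2-2k'}{4(n-1)}-\frac{k'(k'-1)(k'-2)}{12(n-1)^2}+\frac{k'^2(k'-2)^2}{24(n-1)^3}-\frac{k'(k'-1)(k'-2)(3k'^2-6k'-4)}{120(n-1)^4}\\
&\quad+\frac{k'^2(k'-2)^2(k'^2-2k'-2)}{60(n-1)^5}-\frac{k'(k'-1)(k'-2)(3k'^4-12k'^3+24k'+16)}{252(n-1)^6}\\
&\quad+\frac{k'^2(k'-2)^2(3k'^4-12k'^3-4k'^2+32k'+32)}{336(n-1)^7},
\end{aligned} \tag{4-3}$$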

which checks for k’ = 1 to (n—1)-? with Geary (1935, p. 354). Take 


$$v(c) = \frac1n\sum_{i=1}^{n}|z_i|^c \tag{4-4}$$

with $z_i = x_i-\bar x$.
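
The Stirling expansion (4-3) can be compared with the exact logarithm of $Es^{k'}$ from (4-2). The sketch below assumes unit universal variance and evaluates the factorials through the log-gamma function, using $\big(\tfrac12(n+k'-3)\big)! = \Gamma\{\tfrac12(n+k'-1)\}$; the function names are illustrative.

```python
import math

def log_Es_exact(kp, n):
    """log E s^k' for normal samples of n with unit variance, from (4-2)."""
    nu = n - 1
    return (0.5 * kp * math.log(2 / nu)
            + math.lgamma((nu + kp) / 2) - math.lgamma(nu / 2))

def log_Es_series(kp, n):
    """The seven terms of the expansion (4-3) in powers of 1/(n - 1)."""
    nu = n - 1
    a = kp * (kp - 1) * (kp - 2)        # common factor of the even-order terms
    b = kp**2 * (kp - 2)**2             # common factor of the odd-order terms
    return (
        (kp**2 - 2 * kp) / (4 * nu)
        - a / (12 * nu**2)
        + b / (24 * nu**3)
        - a * (3 * kp**2 - 6 * kp - 4) / (120 * nu**4)
        + b * (kp**2 - 2 * kp - 2) / (60 * nu**5)
        - a * (3 * kp**4 - 12 * kp**3 + 24 * kp + 16) / (252 * nu**6)
        + b * (3 * kp**4 - 12 * kp**3 - 4 * kp**2 + 32 * kp + 32) / (336 * nu**7)
    )

if __name__ == "__main__":
    for kp in (1.0, 2.4, 4.8):
        print(kp, log_Es_exact(kp, 25), log_Es_series(kp, 25))
```

For n = 25 the two values agree to many decimal places, which is the degree of accuracy the later tabulation proposals require.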







The moments of v(c) will be found exactly as in the case of c = 1 (Geary, 1936) from the 
single or joint normal frequency distributions of (z,, z,, ...). We find 


Ms(o(e) = (==) "()1, 
Myo(e)} = FOR (=) 14 Zaha —ay-e(n— aero (4) (45) 


«(+a ) a) tal) a) eal) 2) a) +}. 


(4-6) 





For the third moment we write 
. N ay 3n(n—1 n(n —1)(n—2 
M;{ve)} = Bfo(e)}* = 7 |x; [+ PA) Be, |» [ze MO?) Bo fe fea Leal? 
= A,+A,+As, (4-7) 
denoting the three terms on the right by A,, A, A, respectively. Then 


1 /3c— 
i nie eeaet —T) te n—-He+80), 
A, ra 5 *)(an=T 1) n 


ee SS *)1n- 2)iBc+1) (m — 1)-He n+ 


fi (20+ 3) (2¢+1)(0+3)(6+1) +..| 








2!(n — 1)? 4!(n —1)* 
bx (*) (n — 3)¥8e+2 (m, — 2)-4Ge+D (n — 1) Al (SZ) Tf fit fr 
(c+1)*  (c+1)®(C+3)(7e+9) (¢+3)?(c+1)8 
~(a—2pt 8(n — 2)! 2(n —2)8 
4 (6+3)? (c+ 1)? (61c? + 310 + 265) +..| 
240(n — 2) 3 
Similarly, for the fourth moment, 











4n(n — 


i{o(0)} = Beole)}* = 5B |e, [+ M2 Bl a [ale 


+S Bal lal 
nt 


4 MO= WH 20-9) pf 


6n(n —1)(n—2 
(mT) Bh eg] zal 202 








| za |° | 25 |° | 2 |° 
= O,+CO,+0C, +C,+C, (4-8) 
2% 4c—1 
with C= - = (n— iene"), 
we+1) —1\ /c-1 
0, = = (n—2yn0e+9 (x —1-tn-a (E*)(2S*)y 


(3c+1)(c-+1) . (3¢+8)(8¢+1)(c+3)(c+1) 
x {i+ 2n—1)? * a(n — 1) mt | 



















“gre Qe—1\ Bf. (2c+1)®  (2¢+3)2(2c + 1)2 
0, = 2 (n—2yHtern (n—1)-*n ( =)t| \+ seo ee es 








C, = = (n— 3) (n—2) Met (g — nt(*> yh (S ‘) ‘ 
1 (c+1)(5e+3) (€+1)?(2c+1)  (¢+1) (578 + 227c? + 255c + 81) 
x{ ae Crh 24(n —2)8 














(2¢+ 1) (c+ 1)* (6+ 3) (5e+9) | } 
a —@(n — 2)° anne 





1 BE+U?_Me+1)* | (+1) (Te? + 21e4 15) 4(¢ +3) (C+ 1)8 (2c +3) 
+ *(m—3)? (n—3)> (n— 3) is (n—3)5 





4 (c+ 3) (c + 1)? (122c* + 671c? + 1070c + 525) 
15(n— 3) sa 


Formulae (4-5), (4:6), (4-7) and (4-8) were checked from the corresponding formulae for 
c = 1 given in the author’s 1936 paper. 

From the following section it will be apparent that for indefinitely large samples the most
sensitive test of kurtosis of the field a(c) is found for c = 4. At the same time it is shown that
there is really not much difference in efficiency for values of c in the range 5 > c > 2; moreover,
the results in § 6 (in which the efficiency of the tests for c = 4 and c = 1 are compared from
the power function viewpoint) suggest that, for samples of moderate size, the superiority,
if any at all, of a test using $a(4) = b_2$ over other tests in the series may be even less marked.
The disadvantage of a(4) is that its frequency is not known for samples of all sizes; and if we
could estimate, with any degree of confidence, the probability points of a(c) for any value or
values of c > 2 for medium-size samples we might, for practical purposes, dispense with a(4)
altogether, since, while we now know one way of solving the problem of determining the
exact, or almost exact, frequency distribution of a(4), it must be admitted that the method
is extremely tedious. (From the theoretical point of view, however, the a(4) problem must
be solved since it remains a challenge to the mathematical skill of statisticians!) It will
accordingly be of interest to study the order of magnitude of the semi-invariants of a(c)
for c near 2.

Consider the case, for example, of c = 2.4, not by any means, it is important to
observe, the lowest value which would be used for tabulating. In Table 4 the first three
moments are given for n = 25. The L's represent, of course, the semi-invariants. The values
of the functions for $a_1(c)$ (given by (3-2)) for n = 24 (i.e. the appropriate number of degrees
of freedom for comparison with a(c)) are also given. These show that the moments of $a_1(c)$
are very close to those of a(c), which suggests that, when n is not less than, say, 20, the values
of $\beta_1$, $\beta_2$ and corresponding functions of higher orders, if required, for $a_1(c)$ could be used
for the determination of the probability points of a(c). This is important from the computational
point of view because the algebraic expressions for the normal moments of $a_1(c)$
are exceedingly simple whereas it must be conceded that (4-8) offers a grim prospect for the
computer; furthermore, the principal term $C_5$ is rather slowly convergent unless n > 50 or so,







whereas exact values for all values of n can readily be found for the moments of a,(c) for normal 
samples. 


Table 4. Normal moments, etc., of a(c) and $a_1(c)$ for c = 2.4

                                    a(2.4)      $a_1$(2.4)
  n                                 25          24
  $M_1' = L_1$                      1.166252    1.1662524891
  $M_2'$                            1.362004    1.362091186
  $M_3'$                            1.592841    1.593151615
  $M_2 = L_2$                       0.001860    0.001946318
  $M_3 = L_3$                       0.000063    0.000069583
  $\sqrt{\beta_1} = L_3/L_2^{3/2}$  0.80        0.8104

















As with (4-1) for a(c), the moment (from the origin) of any order of $a_1(c)$ is the quotient
of the moments of the same order for numerator and denominator, assuming that the
universal mean is zero and the variance unity. Since the different members $x_i$ of the sample
are independent (the difficulty with a(c) is that the $(x_i-\bar x)$ are not independent) for the
moments of the numerator of (3-2) we require only

$$E|x|^k = \Big(\frac{2}{\pi}\Big)^{1/2}\int_0^{\infty}x^k e^{-x^2/2}\,dx = \frac{2^{k/2}}{\sqrt{\pi}}\Big(\frac{k-1}{2}\Big)!, \tag{4-9}$$

and for the denominator

$$E\Big(\frac1n\sum x_i^2\Big)^{k/2} = \Big(\frac{2}{n}\Big)^{k/2}\Big(\frac{n+k-2}{2}\Big)!\Big/\Big(\frac{n-2}{2}\Big)!. \tag{4-10}$$

The case of $c = 4$ is particularly simple. The first four semi-invariants are as follows:

$$\begin{aligned}
L_1 = M'_1 &= \frac{3n}{n+2},\\
L_2 = M_2 &= \frac{24n^2(n-1)}{(n+2)^2(n+4)(n+6)},\\
L_3 = M_3 &= \frac{1728(n-1)(n-2)n^3}{(n+2)^3(n+4)(n+6)(n+8)(n+10)},\\
L_4 &= \frac{10{,}368\,n^4(n-1)(30n^4 + 168n^3 - 608n^2 - 2672n + 3712)}{(n+2)^4(n+4)^2(n+6)^2(n+8)(n+10)(n+12)(n+14)}.
\end{aligned} \tag{4.11}$$
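As a check on the reading of (4.11), the semi-invariants can be evaluated directly and set against the $c = 4$ columns of Table 5. This is a minimal sketch with hypothetical function names; it does nothing beyond evaluating the printed formulae.

```python
def a1_4_semi_invariants(n):
    """First four semi-invariants L1..L4 of a_1(4) for normal samples, from (4.11)."""
    L1 = 3 * n / (n + 2)
    L2 = 24 * n**2 * (n - 1) / ((n + 2)**2 * (n + 4) * (n + 6))
    L3 = (1728 * n**3 * (n - 1) * (n - 2)
          / ((n + 2)**3 * (n + 4) * (n + 6) * (n + 8) * (n + 10)))
    poly = 30 * n**4 + 168 * n**3 - 608 * n**2 - 2672 * n + 3712
    L4 = (10368 * n**4 * (n - 1) * poly
          / ((n + 2)**4 * (n + 4)**2 * (n + 6)**2
             * (n + 8) * (n + 10) * (n + 12) * (n + 14)))
    return L1, L2, L3, L4

def shape_ratios(L2, L3, L4):
    """Return (L3 / L2^(3/2), L4 / L2^2), the last two rows of Table 5."""
    return L3 / L2**1.5, L4 / L2**2

if __name__ == "__main__":
    for n in (24, 50):
        L1, L2, L3, L4 = a1_4_semi_invariants(n)
        print(n, L1, L2, L3, L4, shape_ratios(L2, L3, L4))
```

The factor $(n-1)$ makes $L_2$, $L_3$ and $L_4$ vanish for $n = 1$, where $a_1(4)$ is identically 1.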


Moments, etc., for $a_1(c)$ for normal samples of 24 and 50 are contrasted for $c = 2.4$ and $c = 4$ in Table 5. The contrast between the values of $\sqrt{\beta_1}$ and $(\beta_2 - 3)$ respectively for $a_1(2.4)$ and $a_1(4)$ is striking in the extreme. Even for $n = 24$, $\sqrt{\beta_1}\,[a_1(2.4)]$ and $\beta_2\,[a_1(2.4)]$ are approaching the values at which a Gram-Charlier approximation to the frequency distribution may be reasonably convergent. Furthermore, the decline in the values of the $\beta$'s from $n = 24$ to $n = 50$ is marked for $a_1(2.4)$, while the decline in the $\beta\,[a_1(4)]$ is very slow.

It is accordingly suggested that a table of probability points (perhaps 0.001, 0.01, 0.025, 0.05 and 0.10) of $a(c)$, for $c$ equal to, say, 2.2, be prepared for $n > 25$ on the assumption that Gram-Charlier applies throughout. For this purpose the values of the mean and variance for $n$ at intervals of, say, 10 should be computed from formulae (4.5) and (4.6); the $\beta_1$ and $(\beta_2 - 3)$ should, however, be computed as for $a_1(c)$. For lower sample sizes it might be well to use terms to order $n^{-2}$, which would render necessary the use of the fifth and sixth semi-invariants of $a_1(c)$. The formulae given by E. A. Cornish & R. A. Fisher (1937) (assuming Gram-Charlier) could be used to find the probability points. On account of the minuteness of the variance $L_2$ for $c$ near 2 it will be necessary to work to many places of decimals, at least 10. As stated at the outset, the test of kurtosis $a(2.2)$ will be only slightly less efficient than $a(4)$ and it may be slightly more efficient than $a(1)$, the probability points of which are known approximately for samples of all sizes. In any case the $a(2.2)$ table would be a useful adjunct to that of $a(1)$.


Table 5. Normal moments, etc., of $a_1(c)$ for $c = 2.4$ and $c = 4$

                                        $n = 24$                       $n = 50$
                                  $c = 2.4$      $c = 4$        $c = 2.4$      $c = 4$
  $M'_1 = L_1$                    1.1662524891   2.769231       1.1721603127   2.884615
  $M_2 = L_2$                     0.001946318    0.559932       0.001058462    0.359550
  $M_3 = L_3$                     0.000069583    0.752488       0.000022251    0.343337
  $L_4 = M_4 - 3L_2^2$            0.000004921    1.955999       0.000000919    0.711375
  $\sqrt{\beta_1} = L_3/L_2^{3/2}$   0.8104         1.7960         0.6462         1.5925
  $\beta_2 - 3 = L_4/L_2^2$       1.30           6.24           0.82           5.50























In an earlier paper (1935) the writer suggested that the correlation between $b_2$ and $a(1)$ for normal samples gave some indication of the relative efficiency of these two tests of normality. In this order of ideas it seems desirable to compute the approximate value of the correlation coefficient between $a(c)$ and $a(c')$, where $c$ and $c'$ are any two positive constants. In the first instance the universe from which the sample of $n$ was drawn was not necessarily normal. Since in the present application we will be concerned only with large samples we assume the universal mean known (and accordingly it may be taken as zero, i.e. $\lambda_1 = 0$), so that, instead of $a(c)$ we use, in reality, $a_1(c)$ given by (3.2). In the remainder of this section we write $a$ for $a_1(c)$ and $a'$ for $a_1(c')$:


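$$a = \Big(\frac1n\sum|x_i|^{c}\Big)\Big/\Big(\frac1n\sum x_i^2\Big)^{\frac12 c}, \tag{4.12}$$

$$a' = \Big(\frac1n\sum|x_i|^{c'}\Big)\Big/\Big(\frac1n\sum x_i^2\Big)^{\frac12 c'}. \tag{4.13}$$

Set

$$\begin{aligned}
y_i &= (|x_i|^{c} - \mu_{[c]})/\mu_{[c]},\\
y'_i &= (|x_i|^{c'} - \mu_{[c']})/\mu_{[c']},\\
z_i &= (x_i^2 - \mu_2)/\mu_2,\\
\alpha &= \mu_{[c]}/\mu_2^{\frac12 c},\qquad \alpha' = \mu_{[c']}/\mu_2^{\frac12 c'},\\
C &= \tfrac12(c + c'),\qquad C_k = \frac{C(C+1)(C+2)\dots(C+k-1)}{k!}.
\end{aligned} \tag{4.14}$$

Then

$$\frac{aa'}{\alpha\alpha'} = \Big(1 + \frac1n\sum y_i\Big)\Big(1 + \frac1n\sum y'_i\Big)\Big(1 + \frac1n\sum z_i\Big)^{-\frac12(c+c')}. \tag{4.15}$$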


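The mean value of $aa'/\alpha\alpha'$ was found approximately (i.e. to terms in $n^{-3}$) by formally expanding the last factor in (4.15), multiplying by the first two factors, and setting down the mean value term by term, so that

$$\begin{aligned}
M'_{cc'}/\alpha\alpha' = E\,aa'/\alpha\alpha' = 1 &+ \frac{C_2}{n^2}\,nEz^2 - \frac{C_3}{n^3}\,nEz^3 + \frac{C_4}{n^4}\Big\{nEz^4 + \frac{6n(n-1)}{2!}(Ez^2)^2\Big\}\\
&- \frac{C_5}{n^5}\,10n(n-1)\,Ez^3Ez^2 + \frac{C_6}{n^6}\,\frac{90n(n-1)(n-2)}{3!}(Ez^2)^3\\
&- \frac{C_1}{n^2}(nEyz + nEy'z) + \frac{C_2}{n^3}(nEyz^2 + nEy'z^2)\\
&- \frac{C_3}{n^4}\big\{n(Eyz^3 + Ey'z^3) + 3n(n-1)Ez^2(Eyz + Ey'z)\big\}\\
&+ \frac{C_4}{n^5}\big\{4n(n-1)Ez^3(Eyz + Ey'z) + 6n(n-1)Ez^2(Eyz^2 + Ey'z^2)\big\}\\
&- \frac{C_5}{n^6}\,\frac{30n(n-1)(n-2)}{2!}(Ez^2)^2(Eyz + Ey'z)\\
&+ \frac{1}{n^2}\,nEyy' - \frac{C_1}{n^3}\,nEyy'z\\
&+ \frac{C_2}{n^4}\big\{nEyy'z^2 + 2n(n-1)Eyz\,Ey'z + n(n-1)Eyy'Ez^2\big\}\\
&- \frac{C_3}{n^5}\big\{n(n-1)Eyy'Ez^3 + 3n(n-1)(Eyz^2Ey'z + Ey'z^2Eyz + Eyy'z\,Ez^2)\big\}\\
&+ \frac{C_4}{n^6}\big\{3n(n-1)(n-2)Eyy'(Ez^2)^2 + 12n(n-1)(n-2)Eyz\,Ey'z\,Ez^2\big\}.
\end{aligned} \tag{4.16}$$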


The $E$'s in (4.16) are readily calculable from (4.14), e.g.

$$Eyy' = Ey_iy'_i = E(|x_i|^{c} - \mu_{[c]})(|x_i|^{c'} - \mu_{[c']})/\mu_{[c]}\mu_{[c']} = (\mu_{[c+c']}/\mu_{[c]}\mu_{[c']}) - 1.$$

It has been verified that when $c$ is substituted for $c'$ in (4.13) the formula agrees with that for the second moment of $a_1(c)$ given in §6.

The coefficient of correlation is, of course,

$$R_{cc'} = M_{cc'}/\sqrt{(M_{cc}M_{c'c'})}, \tag{4.17}$$

with $M_{cc'} = M'_{cc'} - M'_cM'_{c'}$.

Formulae for the first and second moments, to the approximation required, for the computation of (4.17) are given in §6.
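As an application, the following are the values of the variances and the covariance for the tests of normality $a(1)$ and $a(4)$ $(= b_2)$, i.e. in which $c$ and $c'$ have respectively the values 1 and 4, and where the universe belongs to the Pearson system with $\lambda_2 = 1$, $\lambda_3 = 0$ and $\lambda_4 = \frac12$:

$$\begin{aligned}
\frac{M_{cc}}{\mu_{[c]}^2} &= \frac{0.09313705}{n} - \frac{0.262961}{n^2} - \frac{0.196477}{n^3},\\
\frac{M_{c'c'}}{\mu_{[c']}^2} &= \frac{4.4286}{n} - \frac{92.25}{n^2} + \frac{831.2}{n^3},\\
\frac{M_{cc'}}{\mu_{[c]}\mu_{[c']}} &= -\frac{0.491}{n} + \frac{4.87}{n^2} - \frac{281.5}{n^3}.
\end{aligned} \tag{4.18}$$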


From (4.17) and (4.18), $R_{cc'}(n = 100) = -0.826$ and $R_{cc'}(n = \infty) = -0.764$. It is of great interest to find that, though the universe is markedly non-normal, the correlation for indefinitely large samples is practically identical with the normal theory value of $-0.767$ (Geary, 1935), another indication, no doubt, that normal theory inferences can usually be applied with confidence when the parent universe is not markedly unsymmetrical.

When samples are indefinitely large we find, from (4.16) and (4.17),

$$R^{\infty}_{cc'} = \frac{4\mu_{[c+c']} - 2(c\,\mu_{[c]}\mu_{[c'+2]} + c'\mu_{[c']}\mu_{[c+2]}) + \{cc'\mu_4 - (c-2)(c'-2)\}\mu_{[c]}\mu_{[c']}}{\sqrt{(M_{cc}M_{c'c'})}}, \tag{4.19}$$

where, of course, the values to be taken here for $M_{c'c'}$ and $M_{cc}$ are found by substituting respectively $c'$ for $c$ and $c$ for $c'$ in the numerator. When, in addition, the parent universe is normal, we find

$$R^{\infty}_{cc'} = \frac{\sqrt{\pi}\,\big(\frac{c+c'-1}{2}\big)! - (1 + \frac12cc')\big(\frac{c-1}{2}\big)!\,\big(\frac{c'-1}{2}\big)!}{\sqrt{\Big[\sqrt{\pi}\,\big(\frac{2c-1}{2}\big)! - (1 + \frac12c^2)\big\{\big(\frac{c-1}{2}\big)!\big\}^2\Big]\Big[\sqrt{\pi}\,\big(\frac{2c'-1}{2}\big)! - (1 + \frac12c'^2)\big\{\big(\frac{c'-1}{2}\big)!\big\}^2\Big]}}, \tag{4.20}$$

which reduces to $-1/\sqrt{\{12(\pi - 3)\}}$ for $c = 1$, $c' = 4$, as it should (Geary, 1935). The following section will accord $b_2$ (i.e. $a(4)$) a decided primacy amongst tests of normality when the samples are indefinitely large. It may, therefore, be of interest to give the values of the correlation coefficients (for indefinitely large normal samples) between $b_2$ and $a(c)$ for selected values of $c$ (Table 6). The table suggests, in the high coefficients of correlation, except for $c$ very near 0 or 2, that all the $a(c)$ should be reliable tests of kurtosis, with no great difference between their efficiencies. The efficiency of any two tests would be identical, in the conditions stated, if the coefficient of correlation between them was $\pm 1$ because then, of course, they would be functionally, and not stochastically, related.

Table 6. Correlation between $b_2$ and $a(c)$ for indefinitely large normal samples

  Value of $c$   Value of $R^{\infty}_{c4}$      Value of $c$   Value of $R^{\infty}_{c4}$
  0              0                               3              0.980
  1              -0.767                          4              1
  2              0                               5              0.983
  2.2            0.887                           6              0.939
  2.5            0.952                           $\infty$       0
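The entries of Table 6 can be approximated with a few lines of code. The sketch below is an assumption-laden illustration rather than the author's formula: it uses the first-order normal-theory covariance $\mu_{[c+c']}/(\mu_{[c]}\mu_{[c']}) - 1 - \frac12cc'$, which is what the numerator of the limiting correlation becomes when $\mu_{[c+2]} = (c+1)\mu_{[c]}$ and $\mu_4 = 3$. The function names are hypothetical.

```python
import math

def abs_moment_normal(k):
    """E|x|^k for a standard normal variate."""
    return 2 ** (k / 2) * math.gamma((k + 1) / 2) / math.sqrt(math.pi)

def limiting_cov(c, d):
    """n * Cov[a(c), a(d)] / (mu_[c] mu_[d]) to first order for a normal parent."""
    return (abs_moment_normal(c + d)
            / (abs_moment_normal(c) * abs_moment_normal(d))
            - 1 - c * d / 2)

def limiting_correlation(c, d):
    """Large-sample correlation between a(c) and a(d); undefined at c or d = 0, 2."""
    return limiting_cov(c, d) / math.sqrt(limiting_cov(c, c) * limiting_cov(d, d))

if __name__ == "__main__":
    for c in (1, 2.2, 2.5, 3, 5, 6):
        print(c, round(limiting_correlation(c, 4), 3))
```

The function is not defined at $c = 2$, where $a(2)$ is identically 1 and the variance vanishes.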

















5. THE MOST EFFICIENT TESTS FOR INDEFINITELY LARGE SAMPLES 
In this section we consider the efficiency of tests of kurtosis and asymmetry from the viewpoint of indefinitely large samples.

By definition a test will be regarded as valid, in relation to a field of continuous alternative universes including the normal, if its value for infinite samples drawn at random from the normal universe is different from its value for infinite samples from other universes of the field. As the sample number increases the test will become increasingly discriminatory of the normal as distinct from other universes of the field. This increased sensitivity might be given mathematical expression in some such terms as the following: given a probability $\alpha$ (say 0.01), the normal universe $W_0$ of the field and any other distribution $W_1$ of the field, a number $n_1$ can be found so that for $n > n_1$ the mean value of the test function for samples of $n$ from $W_1$ will lie at or beyond the $\alpha$ probability point of the test function for samples of $n$ from $W_0$: the smaller $n_1$, the more sensitive the test.

We consider, then, the infinite field of alternative tests of kurtosis represented by (3.1) when $c$ assumes all positive values, and the infinite field of alternative universes represented by the Gram-Charlier frequency

$$f(x) = \frac{1}{\sqrt{2\pi}}\exp\Big\{\sum_{i=3}^{\infty}\frac{\lambda_i}{i!}\Big(-\frac{d}{dx}\Big)^{i}\Big\}e^{-\frac12x^2}. \tag{5.1}$$

The universal variance is assumed to be unity, without loss of generality. The normal universe is a member of the field: it is found when all the $\lambda_i$ ($i > 2$) are zero. We assume that the conditions of §2 are satisfied so that for indefinitely large samples the frequency distribution of $a(c)$ for all parent universes is normal. Obviously the efficiency of any particular test (i.e. $a(c)$ for a particular value of $c$) in regard to the normal and a particular non-normal alternative (i.e. a Gram-Charlier frequency with particular values of the $\lambda_i$) will be adjudged by considering the ratio of

(i) the difference between the universal mean values of $a(c)$ for the normal and the particular non-normal parent universes; to

(ii) the standard deviation of $a(c)$ for indefinitely large normal samples.

The most efficient test will be $a(c)$ for $c$ a theoretically ascertainable function of the given $\lambda_i$ which makes the ratio a maximum.

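For indefinitely large samples the mean value $\phi$ of $a(c)$ when the parent universe is given by (5.1) is

$$\phi = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty}dx\,|x|^{c}\exp\Big[\sum_{i=3}^{\infty}\frac{\lambda_i}{i!}\Big(-\frac{d}{dx}\Big)^{i}\Big]e^{-\frac12x^2}. \tag{5.2}$$

Obviously

$$\int_{-\infty}^{\infty}dx\,|x|^{c}\Big(\frac{d}{dx}\Big)^{2m+1}e^{-\frac12x^2} = 0.$$

Also, when $m \ge 1$,

$$\int_{-\infty}^{\infty}dx\,|x|^{c}\Big(\frac{d}{dx}\Big)^{2m}e^{-\frac12x^2} = \Big(\frac{c-1}{2}\Big)!\,2^{\frac12(c+1)}\,c(c-2)(c-4)\dots(c-2m+2), \tag{5.3}$$

a result readily inferable from the obvious fact that the left side vanishes for $c = 0, 2, \dots, 2m-2$. Accordingly

$$\phi = \Big(\frac{c-1}{2}\Big)!\,\frac{2^{\frac12(c+1)}}{\sqrt{2\pi}}\Big\{1 + \frac{\lambda_4}{4!}c(c-2) + \Big(\frac{\lambda_6}{6!} + \frac{\lambda_3^2}{2!\,(3!)^2}\Big)c(c-2)(c-4) + \dots\Big\}. \tag{5.4}$$

The normal value is given by the first term.

From (4.3), (4.5) and (4.6) it is evident that the value of the standard deviation, for larger normal samples (retaining only $n^{-\frac12}$) is

$$\sigma = \Big(\frac{c-1}{2}\Big)!\,\frac{2^{\frac12(c+1)}}{\sqrt{2\pi n}}\Big\{\frac{\big(\frac{2c-1}{2}\big)!\,\sqrt{\pi}}{\big\{\big(\frac{c-1}{2}\big)!\big\}^2} - 1 - \frac{c^2}{2}\Big\}^{\frac12}. \tag{5.5}$$

The principal term in the deviation $\phi - \phi^0$ (where $\phi^0$ is the normal value), from (5.4), is

$$\delta = \Big(\frac{c-1}{2}\Big)!\,\frac{2^{\frac12(c+1)}}{\sqrt{2\pi}}\,\frac{\lambda_4\,c(c-2)}{4!}. \tag{5.6}$$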


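To a constant factor, the ratio $\delta/\sigma$ is given by the first discriminant

$$p(c) = c(c-2)\Big\{\frac{\big(\frac{2c-1}{2}\big)!\,\sqrt{\pi}}{\big\{\big(\frac{c-1}{2}\big)!\big\}^2} - 1 - \frac{c^2}{2}\Big\}^{-\frac12}. \tag{5.7}$$

It will now be shown that $p'(c) = 0$ for $c = 4$.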
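The discriminant may be written in the form

$$p(c) = (c-2)\Big(\frac{2^{c}I_{2c}}{c^2I_c} - \frac{1}{c^2} - \frac12\Big)^{-\frac12}, \tag{5.8}$$

where

$$I_c = \int_0^{\frac12\pi}\cos^{c}\theta\,d\theta, \tag{5.9}$$

so that

$$\frac{p'(c)}{p(c)} = \frac{1}{c-2} - \frac12\Big\{\frac{2^{c}I_{2c}}{c^2I_c}\Big(\log 2 + \frac{2J_{2c}}{I_{2c}} - \frac{J_c}{I_c} - \frac{2}{c}\Big) + \frac{2}{c^3}\Big\}\Big/\Big\{\frac{2^{c}I_{2c}}{c^2I_c} - \frac{1}{c^2} - \frac12\Big\}. \tag{5.10}$$

From (5.9)

$$J_c = I'_c = \int_0^{\frac12\pi}d\theta\,\cos^{c}\theta\,\log\cos\theta. \tag{5.11}$$

From a fairly well-known property

$$J_0 = \int_0^{\frac12\pi}d\theta\,\log\cos\theta = -\tfrac12\pi\log 2. \tag{5.12}$$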


In (5.10) we shall be concerned only with even positive integer values of $c$. We have at once

$$I_0 = \tfrac12\pi,\quad I_2 = \tfrac14\pi,\quad I_4 = \frac{3\pi}{16},\quad I_6 = \frac{5\pi}{32},\quad I_8 = \frac{35\pi}{256}. \tag{5.13}$$

From (5.11)

$$J_{2c} = \int_0^{\frac12\pi}d\theta\,\cos^{2c}\theta\,\log\cos\theta = \int_0^{\frac12\pi}d(\sin\theta)\,\cos^{2c-1}\theta\,\log\cos\theta,$$

which, by partial integration,

$$= \int_0^{\frac12\pi}d\theta\,\sin\theta\Big\{(2c-1)\sin\theta\cos^{2c-2}\theta\log\cos\theta + \frac{\sin\theta\cos^{2c-1}\theta}{\cos\theta}\Big\} = (2c-1)(J_{2c-2} - J_{2c}) + I_{2c-2} - I_{2c}.$$

Hence

$$2cJ_{2c} = (2c-1)J_{2c-2} - I_{2c} + I_{2c-2}. \tag{5.14}$$

From (5.12), (5.13) and (5.14),

$$\begin{aligned}
J_0 &= -\tfrac12\pi\log 2, & J_6 &= (-60\pi\log 2 + 37\pi)/384,\\
J_2 &= (-2\pi\log 2 + \pi)/8, & J_8 &= (-840\pi\log 2 + 533\pi)/6144.\\
J_4 &= (-12\pi\log 2 + 7\pi)/64,
\end{aligned} \tag{5.15}$$

Noting that $dI_{2c}/dc = 2J_{2c}$, and substituting in the right side of (5.10) the values of $I$ and $J$ given by (5.13) and (5.15), we find $p'(4) = 0$. Table 7 gives the values of the discriminant for certain values of $c$.
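The recurrence (5.14) and the vanishing of $p'(4)$ can be checked in exact rational arithmetic. The following is a sketch under stated assumptions: each $J_{2k}/\pi$ is stored as a pair $(A, B)$ standing for $A\log 2 + B$; the logarithmic derivative is obtained by differentiating the discriminant written as $c(c-2)\{2^{c}I_{2c}/I_c - 1 - \frac12c^2\}^{-\frac12}$, which is this sketch's own arrangement of the algebra; the names are hypothetical. The optional extra factor $(c-4)$ gives the second discriminant considered later in the section.

```python
from fractions import Fraction

def wallis_and_log_integrals(kmax):
    """Return lists of I_{2k}/pi and J_{2k}/pi for k = 0..kmax.
    Each J entry is a pair (A, B) meaning A*log(2) + B."""
    I = [Fraction(1, 2)]                       # I_0 = pi/2
    J = [(Fraction(-1, 2), Fraction(0))]       # J_0 = -(pi/2) log 2, from (5.12)
    for k in range(1, kmax + 1):
        I.append(I[-1] * Fraction(2 * k - 1, 2 * k))
        A_prev, B_prev = J[-1]
        # (5.14) with 2c = 2k: 2k J_2k = (2k-1) J_{2k-2} - I_2k + I_{2k-2}
        A = (2 * k - 1) * A_prev / (2 * k)
        B = ((2 * k - 1) * B_prev - I[k] + I[k - 1]) / (2 * k)
        J.append((A, B))
    return I, J

def log_derivative(c, second=False):
    """(coefficient of log 2, rational part) of p'(c)/p(c) at an even integer c,
    or of p_2'(c)/p_2(c) when second=True."""
    I, J = wallis_and_log_integrals(c)         # index k holds I_{2k}, J_{2k}
    half = c // 2
    ratio = 2**c * I[c] / I[half]              # equals mu_[2c] / mu_[c]^2
    D = ratio - 1 - Fraction(c * c, 2)
    # d(ratio)/dc = ratio * (log 2 + 2 J_2c / I_2c - J_c / I_c)
    log2_part = ratio * (1 + 2 * J[c][0] / I[c] - J[half][0] / I[half])
    rational_part = ratio * (2 * J[c][1] / I[c] - J[half][1] / I[half]) - c
    lead = Fraction(1, c) + Fraction(1, c - 2)
    if second:
        lead += Fraction(1, c - 4)
    return -log2_part / (2 * D), lead - rational_part / (2 * D)

if __name__ == "__main__":
    print(wallis_and_log_integrals(4)[1])      # J_0 .. J_8 as (A, B) pairs
    print(log_derivative(4))                   # both parts vanish
    print(log_derivative(6, second=True))
```

The terms in $\log 2$ cancel identically, so the check reduces to arithmetic on fractions.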

The discriminant accordingly assumes a maximum value for $c = 4$, a result so remarkable that one might be inclined to suspect that it is a consequence of the form which was assumed for the alternative to the normal curve, a form which, in placing such emphasis on $\lambda_4$, high-lights, so to speak, $b_2$ ($= \lambda_4 + 3$ when $\lambda_2 = 1$ for indefinitely large samples) as a test of normality. From the algebraic point of view this is anything but obvious: the property emerges from quite a complicated piece of algebra. It may also be emphasized that the field of alternatives (5.1) is not arbitrary; it is a general form of frequency distribution when all the $\lambda_i$ are finite. Admittedly the discriminant takes account only of the term in $\lambda_4$ in the expansion; but this is certainly the most significant term for a wide class of frequency distributions, namely, those of homogeneous symmetrical functions of samples of $n$ as $n$ tends towards infinity under very general conditions for the parent universe, provided that the resulting frequency distribution can be assumed to have its third moment zero; for then the only term in $n^{-1}$ in the frequency distribution of the function will be the term in $\lambda_4$. The significance of the property demonstrated must not be overstressed since it is subject to many qualifications, but it gives strong grounds for holding that, for very large samples, $b_2$ is the most efficient test of normality of tests of type $a(c)$ in relation to a very extended class of alternative universes. At the same time Table 7 shows that there can be little difference in efficiency in the field $a(c)$ for $c$ ranging from close to 2 to about 5. There is but little doubt, on this showing, that $b_2$ is more sensitive than $a(1)$, a conclusion suggested on the basis of certain experimental results by E. S. Pearson (1935) and examined from the viewpoint of power function theory in §6.








Table 7

  $0 < c < 2$   Discriminant $p(c)$      $2 < c < \infty$   Discriminant $p(c)$
  +0            -2.334                   2+0                4.460
  0.1           -2.541                   2.1                4.508
  0.2           -2.725                   2.5                4.666
  0.5           -3.188                   3.0                4.801
  0.7           -3.441                   3.9                4.898
  1.0           -3.758                   4.0                4.900
  1.1           -3.851                   4.1                4.898
  1.5           -4.166                   5.0                4.818
  1.9           -4.405                   6.0                4.602
  2-0           -4.460                   7.0                4.288
                                         8.0                3.906
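The tabulated discriminant can be reproduced numerically from (5.7). The sketch below is illustrative, with hypothetical function names: the factorials are evaluated through the log-gamma function, and the second function is an assumption-based reading of the sensitivity criterion of this section, taking $\delta/\sigma = \sqrt{n}\,(\lambda_4/4!)\,p(c)$ and asking for the sample number at which the alternative mean reaches the normal-theory $\alpha$ point (normal standard deviation used throughout).

```python
import math
from statistics import NormalDist

def discriminant(c):
    """First discriminant p(c) of (5.7); undefined at c = 0 and c = 2."""
    ratio = math.sqrt(math.pi) * math.exp(
        math.lgamma(c + 0.5) - 2 * math.lgamma((c + 1) / 2))   # mu_[2c] / mu_[c]^2
    return c * (c - 2) / math.sqrt(ratio - 1 - c * c / 2)

def sample_size_for_mean_at_alpha_point(c, lambda4, alpha):
    """n at which sqrt(n) * (lambda4 / 24) * |p(c)| equals the normal deviate
    for upper-tail probability alpha (large-sample approximation)."""
    z = NormalDist().inv_cdf(1 - alpha)
    return (24 * z / (lambda4 * abs(discriminant(c)))) ** 2

if __name__ == "__main__":
    for c in (1.0, 3.0, 4.0, 5.0, 6.0):
        print(c, round(discriminant(c), 3))
    print(sample_size_for_mean_at_alpha_point(4, 0.5, 0.01))
    print(sample_size_for_mean_at_alpha_point(1, 0.5, 0.01))
```

On this reckoning $a(4)$ needs a smaller sample than $a(1)$ to bring the alternative mean to the same probability point, in line with the conclusion drawn from Table 7.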




















Adverting to (5.4) in conjunction with (5.5), it might be asked if, on the analogy of the maximal property just demonstrated for the first discriminant, the function

$$p_2(c) = c(c-2)(c-4)\Big\{\frac{\big(\frac{2c-1}{2}\big)!\,\sqrt{\pi}}{\big\{\big(\frac{c-1}{2}\big)!\big\}^2} - 1 - \frac{c^2}{2}\Big\}^{-\frac12}$$

has a turning point at $c = 6$. The answer is in the negative. The value of $p'_2(6)/p_2(6)$ is, in fact, 15/34. At the same time there must be a zero of $p'_2(c)$ very near $c = 6$ since

$$p_2(5.9) = 8.79,\quad p_2(6) = 9.20,\quad p_2(6.1) = 8.56.$$

Analogous to the field of tests of kurtosis represented by (3.1) we may consider as a field of tests of asymmetry:

$$g(c) = \frac1n\Big\{-\Sigma'|x_i - \bar{x}|^{c} + \Sigma''(x_i - \bar{x})^{c}\Big\}\Big/\Big\{\frac1n\Sigma(x_i - \bar{x})^2\Big\}^{\frac12c}, \tag{5.16}$$

where $\Sigma'$ extends to the observations $x_i$ less than the mean $\bar{x}$ and $\Sigma''$ to the rest of the sample. For $c = 3$ the test is, of course, $\sqrt{b_1}$. For normal samples

$$E\{g(c)\}^{k} = E\Big[\frac1n\Big\{-\Sigma'|x_i - \bar{x}|^{c} + \Sigma''(x_i - \bar{x})^{c}\Big\}\Big]^{k}\Big/E\Big[\frac1n\Sigma(x_i - \bar{x})^2\Big]^{\frac12kc}, \tag{5.17}$$

the denominator of which is identical with the denominator of (4.1). Knowing the joint distribution (for normal samples) of $(x_1 - \bar{x})$, $(x_2 - \bar{x})$, ... (Geary, 1936), there is no theoretical difficulty in finding the mean values of the terms of the numerator for positive integer values of $k$. Here we shall be concerned only with the first and second moments, i.e. those for (5.17) for $k = 1$ and $k = 2$. We require the normal distribution of $z_1 = x_1 - \bar{x}$ and the joint distribution of $z_1$ and $z_2 = x_2 - \bar{x}$. These are


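$$z_1:\quad \Big(\frac{n}{2\pi(n-1)}\Big)^{\frac12}\exp\Big\{-\frac{nz_1^2}{2(n-1)}\Big\}\,dz_1 = f(z_1)\,dz_1,$$

$$(z_1, z_2):\quad \frac{1}{2\pi}\Big(\frac{n}{n-2}\Big)^{\frac12}\exp\Big\{-\frac{(n-1)(z_1^2 + z_2^2)}{2(n-2)} - \frac{z_1z_2}{n-2}\Big\}\,dz_1\,dz_2 = f(z_1, z_2)\,dz_1\,dz_2. \tag{5.18}$$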


Clearly the odd normal moments of $g(c)$ are zero. Then

$$E\Big[\frac1n\Big\{-\Sigma'|x_i - \bar{x}|^{c} + \Sigma''(x_i - \bar{x})^{c}\Big\}\Big]^2 = \frac1nE|z_1|^{2c} + \frac{n-1}{n}E_2(z_1, z_2), \tag{5.19}$$

where $E_2(z_1, z_2)$ is the mean value of the two-dimensional terms. We then have
0 0 0 a) 
Bye) =| (-alede[" (—adeaflewen)— [day [” deyeafle) 
co 0 - oo cs) 
se f. de, [ i d29( — 22)" f (21,22) + 3 dz, 24 [ dz, 2 f (21,22) 


7 Ic 8 4 Adz, def f(—21, — 2) —f(— 2,22) —f(%, — 22) + fl, %2)} 
~~ (ss) Ss G)) 








an \n—2) (n—1/**\2 3!(n — 1)? 5!(n — 1)4 (5-29) 
Exp = =" (2) = (5-21) 
mame CICS 


We now have all the expressions required for the variance of normal $g(c)$. We require, for what follows, only the term in $n^{-1}$ which is


o = | => )\5- fae 5:23) 
1% 2 }'Jjm \2*) awl ( , 


Consider now a field of alternative universes represented by

$$f(x) = \frac{1}{\sqrt{2\pi}}\Big\{1 + \frac{\lambda_3}{3!}(x^3 - 3x)\Big\}e^{-\frac12x^2}, \tag{5.24}$$

the 'first approximation to the law of error' (for universal variance unity), obviously the most appropriate asymmetrical field, for different values of the parameter $\lambda_3$, and containing as a member of the field the normal distribution found for $\lambda_3 = 0$. For indefinitely large samples from (5.24) the mean value of $g(c)$ is


Oa ° aL As a5 : 
From (5-23) and (5-25) < = As r(c), (5-26) 


the skew discriminant $r(c)$ being given by
- 2e—1\ (ec \-*Jn ,\*_ ee fT. .\t , 
Log-differentiating, 
re) Bl A (thats Ieee) 





| ae Ln 
| ee | Torts esl 2+) Tis \" 
—“teta __“ _, “peta OSA \] "Mata _y 5-2 
Tes (+P Tas 2e+1) \2c+1 Ly (5-28) 
Setting $c = 3$ and using (5.13) and (5.15), we find that $r'(3) = 0$. Values of $r(c)$ for four values of $c$ are as follows:

  $c$     $r(c)$        $c$     $r(c)$
  2       2.370         4       2.389
  3       2.450         5       2.236

Accordingly, for indefinitely large samples the test of asymmetry $g(c)$ is most efficient for $c = 3$, when the test becomes the familiar $\sqrt{b_1}$. The margin in favour of this value of $c$, as compared with others in the range $2 < c < 5$, is, however, quite small.
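The four tabulated values of the skew discriminant can be approximated by a short calculation. The form used below is an assumption of this sketch, not a transcription of the printed formula: for large normal samples the variance of the numerator of $g(c)$ is taken as proportional to $\mu_{[2c]} - \mu_{[c+1]}^2$, and the mean under (5.24) as proportional to $(c-1)\mu_{[c+1]}$, with $\mu_{[k]} = E|x|^{k}$ for the standard normal. Function names are hypothetical.

```python
import math

def abs_moment_normal(k):
    """E|x|^k for a standard normal variate."""
    return 2 ** (k / 2) * math.gamma((k + 1) / 2) / math.sqrt(math.pi)

def skew_discriminant(c):
    """Assumed large-sample form of r(c): mean shift per unit lambda_3 / 3!
    divided by the normal-theory standard deviation (both scaled by sqrt(n))."""
    mean_shift = (c - 1) * abs_moment_normal(c + 1)
    variance = abs_moment_normal(2 * c) - abs_moment_normal(c + 1) ** 2
    return mean_shift / math.sqrt(variance)

if __name__ == "__main__":
    for c in (2, 3, 4, 5):
        print(c, round(skew_discriminant(c), 3))
```

For $c = 3$ the expression reduces to $6/\sqrt{6}$, the familiar large-sample standard deviation $\sqrt{(6/n)}$ of $\sqrt{b_1}$ appearing in the denominator.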


6. TESTS OF KURTOSIS FROM THE POWER FUNCTION VIEWPOINT 


It may be useful to open this section with an interpretation of the results of the previous section from the point of view of the type of error theory of J. Neyman & E. S. Pearson (1933, 1936). For this we consider two universes of the field, the normal $W_0$ and any non-normal universe $W_1$, and two tests of kurtosis $a(4) = b_2$ and $a(c_1)$ for a particular value $c_1$ of $c$. Suppose that samples are sufficiently large that $a(c)$, for samples from all universes of the field, may be regarded as normally distributed.

Given a probability $\alpha$, a sample number $n$ can be found so that the mean value of $a(c_1)$ from $W_1$ lies exactly at, say, the upper $\alpha$ probability point of the distribution of $a(c_1)$ from $W_0$. Then from the results established in the preceding section the value of $a(4)$ for the same sample of $n$ from $W_1$ could lie beyond the $\alpha$ probability point of $a(4)$ for normal samples of $n$. Suppose that the rule adopted was to regard as non-normal all samples for which $a(c)$ lies beyond the normal $\alpha$ probability point, and suppose that a very large number $N$ of samples were drawn, $N_0$ from universes not significantly different from normal (defining 'insignificance' in some manner) and $N_1$ from non-normal universes, so that $N = N_0 + N_1$, where $N_0$ and $N_1$ are not necessarily known in advance. Then using $a(c_1)$ the number of erroneous allocations will be approximately $\alpha N_0 + \frac12N_1$, whereas using $a(4)$ the number will be $\alpha N_0 + (\frac12 - p)N_1$ ($\frac12 > p > 0$), showing a definite advantage in favour of $a(4)$. The same conclusion emerges whatever value of $c \ne 4$ or whatever non-normal universe be taken for comparison.

The type of error approach reveals the theoretical weakness of using the method of §5 for the assessment of relative efficiency of tests of normality; namely that the proportion of errors of judgment, even using $a(4)$, remains large, due fundamentally to concentrating on a single value (the mean) as typical or representative of samples from the non-normal universe; it is also a disadvantage that the sample number $n_1$ is necessarily a function of the particular value $c_1$ of $c$. The method has further disadvantages of which the principal are perhaps (i) a somewhat restricted field of alternative universes; (ii) the assumption that the samples were indefinitely large, essential to justify the normality of $a(c)$ for samples from any member of the universe field.

The Neyman-Pearson power function approach which will now be considered cannot be regarded as entirely free from these objections in its application to the material so far available from this research. It enables us, at any rate, to contemplate samples which, if not small, are within the range of experimental practicability.

The problem of the relative efficiency of the different members of a field of tests of kurtosis $a(c)$ will now be considered in its power function aspects. For the present purpose the power may be defined as follows:

Given a probability $\alpha$ (say 0.01), a sample number $n$, a particular value $c_1$ of $c$ and a non-normal parent universe $W_1$, the power, in relation to these data, represents the frequency of $a(c_1)$ for samples drawn at random from $W_1$ lying beyond the $\alpha$ probability point for $a(c_1)$ computed from samples drawn from a normal universe. The greater the power the more discriminatory the test. Accordingly, it is in theory necessary to know the frequency distribution of $a(c)$ for all sample sizes, for all values of $c$ and for all universes. Considering that the only frequency distribution of the field contemplated which can be regarded as determined for all sample sizes is $a(1)$ for normal samples (Geary, 1935, 1936), many compromises are necessary to give any kind of practical effect to the power concept. The compromises proposed are as follows:

(1) The form $a_1(c)$, given by (3.2), is used instead of the form $a(c)$ given by (3.1).

(2) Only large samples are dealt with.

(3) The field of alternative universes is restricted.

Using $a_1(c)$, the first four moments (from the origin) of $a_1(c)$ for samples from any universe can be expanded without real difficulty, and so approximate frequency distributions (using the Karl Pearson or Gram-Charlier systems) can be obtained. As to (1), from experiments in $a(1)$ and $a(4)$ the writer has verified that, for medium-sized normal samples, there is little difference between the probability points (e.g. 0.01, 0.05) of $a_1(c)$ and $a(c)$, though the higher semi-invariants (given $n$) are larger for the latter. In regard to (2) and (3) little confidence could be reposed in the values of the moments computed from expansions even to $n^{-3}$ unless the sample number was at least of the order of 100 when $c$ is greater than, say, 3; and, even if the moments were known exactly, the empirical frequencies would be more than doubtful for small samples. The approach finds its main justification in the consideration that any errors due to these necessary compromises may be presumed to apply more or less equally and in the same direction to the tests of kurtosis compared; generous, perhaps too generous, advantage is taken of this justification in the concluding part of this section.


Set, then,

$$a_1(c) = \Big(\frac1n\sum|x_i|^{c}\Big)\Big/\Big\{\frac1n\sum x_i^2\Big\}^{\frac12c}, \tag{6.1}$$

so that

$$a_1(c)/\alpha = \Big(1 + \frac1n\sum y_i\Big)\Big(1 + \frac1n\sum z_i\Big)^{-\frac12c}, \tag{6.2}$$

where

$$\alpha = \mu_{[c]}/\mu_2^{\frac12c},\qquad y_i = (|x_i|^{c} - \mu_{[c]})/\mu_{[c]},\qquad z_i = (x_i^2 - \mu_2)/\mu_2, \tag{6.3}$$

the universal mean being taken as zero, without loss of generality. Raising (6.2) to powers 1, 2, 3, 4, expanding to the required degree the final factor, multiplying by the first factor on the right, and setting down the mean value of each term we find, to $n^{-3}$,


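$$\begin{aligned}
M'_1/\alpha = 1 &- \frac1n\{k^{(1)}_1(11) - k^{(1)}_2(02)\} + \frac1{n^2}\{k^{(1)}_2(12) - k^{(1)}_3[(03) + 3(11)(02)] + 3k^{(1)}_4(02)^2\}\\
&+ \frac1{n^3}\{k^{(1)}_3[3(11)(02) - (13)] + k^{(1)}_4[(04) - 3(02)^2 + 4(11)(03) + 6(12)(02)]\\
&\qquad - k^{(1)}_5[10(03)(02) + 15(11)(02)^2] + 15k^{(1)}_6(02)^3\},
\end{aligned} \tag{6.4}$$

$$\begin{aligned}
M'_2/\alpha^2 = 1 &+ \frac1n\{k^{(2)}_2(02) - 2k^{(2)}_1(11) + (20)\} + \frac1{n^2}\{-k^{(2)}_3(03) + 3k^{(2)}_4(02)^2\\
&\qquad + 2k^{(2)}_2(12) - 6k^{(2)}_3(11)(02) - k^{(2)}_1(21) + k^{(2)}_2(20)(02) + 2k^{(2)}_2(11)^2\}\\
&+ \frac1{n^3}\{k^{(2)}_4[(04) - 3(02)^2] - 10k^{(2)}_5(03)(02) + 15k^{(2)}_6(02)^3 - 2k^{(2)}_3[(13) - 3(11)(02)]\\
&\qquad + 4k^{(2)}_4[2(11)(03) + 3(12)(02)] - 30k^{(2)}_5(11)(02)^2\\
&\qquad + k^{(2)}_2[(22) - (20)(02)] - k^{(2)}_3[(20)(03) + 3(21)(02)] - 2k^{(2)}_2(11)^2\\
&\qquad - 6k^{(2)}_3(12)(11) + 12k^{(2)}_4(11)^2(02) + 3k^{(2)}_4(20)(02)^2\},
\end{aligned} \tag{6.5}$$

$$\begin{aligned}
M'_3/\alpha^3 = 1 &+ \frac1n\{k^{(3)}_2(02) - 3k^{(3)}_1(11) + 3(20)\} + \frac1{n^2}\{-k^{(3)}_3(03) + 3k^{(3)}_4(02)^2\\
&\qquad + 3k^{(3)}_2(12) - 9k^{(3)}_3(11)(02) - 3k^{(3)}_1(21) + 3k^{(3)}_2(20)(02) + 6k^{(3)}_2(11)^2\\
&\qquad + (30) - 3k^{(3)}_1(20)(11)\}\\
&+ \frac1{n^3}\{k^{(3)}_4[(04) - 3(02)^2] - 10k^{(3)}_5(03)(02) + 15k^{(3)}_6(02)^3\\
&\qquad - 3k^{(3)}_3[(13) - 3(11)(02)] + 6k^{(3)}_4[2(11)(03) + 3(12)(02)] - 45k^{(3)}_5(11)(02)^2\\
&\qquad + 3k^{(3)}_2[(22) - (20)(02)] - 3k^{(3)}_3[(20)(03) + 3(21)(02)] + 9k^{(3)}_4(20)(02)^2\\
&\qquad - 6k^{(3)}_2(11)^2 - 18k^{(3)}_3(12)(11) + 36k^{(3)}_4(11)^2(02) - k^{(3)}_1(31)\\
&\qquad + k^{(3)}_2(30)(02) + 3k^{(3)}_1(20)(11)\\
&\qquad + 3k^{(3)}_2[(20)(12) + 2(21)(11)] - 9k^{(3)}_3(20)(11)(02) - 6k^{(3)}_3(11)^3\},
\end{aligned} \tag{6.6}$$

$$\begin{aligned}
M'_4/\alpha^4 = 1 &+ \frac1n\{k^{(4)}_2(02) - 4k^{(4)}_1(11) + 6(20)\} + \frac1{n^2}\{-k^{(4)}_3(03) + 3k^{(4)}_4(02)^2\\
&\qquad + 4k^{(4)}_2(12) - 12k^{(4)}_3(11)(02) - 6k^{(4)}_1(21) + 6k^{(4)}_2(20)(02) + 12k^{(4)}_2(11)^2\\
&\qquad + 4(30) - 12k^{(4)}_1(20)(11) + 3(20)^2\}\\
&+ \frac1{n^3}\{k^{(4)}_4[(04) - 3(02)^2] - 10k^{(4)}_5(03)(02) + 15k^{(4)}_6(02)^3 - 4k^{(4)}_3[(13) - 3(11)(02)]\\
&\qquad + 8k^{(4)}_4[2(11)(03) + 3(12)(02)] - 60k^{(4)}_5(11)(02)^2 + 6k^{(4)}_2[(22) - (20)(02)]\\
&\qquad - 6k^{(4)}_3[(20)(03) + 3(21)(02)] + 18k^{(4)}_4(20)(02)^2 - 12k^{(4)}_2(11)^2\\
&\qquad - 36k^{(4)}_3(12)(11) + 72k^{(4)}_4(11)^2(02) - 4k^{(4)}_1(31) + 4k^{(4)}_2(30)(02)\\
&\qquad + 12k^{(4)}_1(20)(11) + 12k^{(4)}_2[(12)(20) + 2(21)(11)]\\
&\qquad - 36k^{(4)}_3(20)(11)(02) - 24k^{(4)}_3(11)^3 + (40) - 4k^{(4)}_1(30)(11) - 3(20)^2\\
&\qquad - 6k^{(4)}_1(20)(21) + 3k^{(4)}_2(20)^2(02) + 12k^{(4)}_2(20)(11)^2\},
\end{aligned} \tag{6.7}$$

where

$$k^{(p)}_q = \tfrac12pc\,(\tfrac12pc + 1)\dots(\tfrac12pc + q - 1)/q!,\qquad (fg) = Ey_i^{f}z_i^{g},$$

the latter, of course, the same for all $i$. The $(fg)$ required for the computation of (6.4)-(6.7) are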
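$$\begin{aligned}
(11) &= (\mu_{[c+2]} - \mu_2\mu_{[c]})/\mu_2\mu_{[c]},\\
(02) &= (\mu_4 - \mu_2^2)/\mu_2^2,\\
(12) &= (\mu_{[c+4]} - 2\mu_{[c+2]}\mu_2 - \mu_{[c]}\mu_4 + 2\mu_{[c]}\mu_2^2)/\mu_{[c]}\mu_2^2,\\
(03) &= (\mu_6 - 3\mu_4\mu_2 + 2\mu_2^3)/\mu_2^3,\\
(04) &= (\mu_8 - 4\mu_6\mu_2 + 6\mu_4\mu_2^2 - 3\mu_2^4)/\mu_2^4,\\
(13) &= [\mu_{[c+6]} - 3\mu_{[c+4]}\mu_2 + 3\mu_{[c+2]}\mu_2^2 - \mu_{[c]}(\mu_6 - 3\mu_4\mu_2 + 3\mu_2^3)]/\mu_{[c]}\mu_2^3,\\
(21) &= [\mu_{[2c+2]} - 2\mu_{[c+2]}\mu_{[c]} - \mu_2(\mu_{[2c]} - 2\mu_{[c]}^2)]/\mu_{[c]}^2\mu_2,\\
(22) &= (\mu_{[2c+4]} - 2\mu_{[2c+2]}\mu_2 + \mu_{[2c]}\mu_2^2 - 2\mu_{[c+4]}\mu_{[c]} + 4\mu_{[c+2]}\mu_{[c]}\mu_2 - 3\mu_{[c]}^2\mu_2^2 + \mu_{[c]}^2\mu_4)/\mu_{[c]}^2\mu_2^2,\\
(20) &= (\mu_{[2c]} - \mu_{[c]}^2)/\mu_{[c]}^2,\\
(30) &= (\mu_{[3c]} - 3\mu_{[2c]}\mu_{[c]} + 2\mu_{[c]}^3)/\mu_{[c]}^3,\\
(31) &= (\mu_{[3c+2]} - 3\mu_{[2c+2]}\mu_{[c]} + 3\mu_{[c+2]}\mu_{[c]}^2 - \mu_{[3c]}\mu_2 + 3\mu_{[2c]}\mu_{[c]}\mu_2 - 3\mu_{[c]}^3\mu_2)/\mu_{[c]}^3\mu_2,\\
(40) &= (\mu_{[4c]} - 4\mu_{[3c]}\mu_{[c]} + 6\mu_{[2c]}\mu_{[c]}^2 - 3\mu_{[c]}^4)/\mu_{[c]}^4.
\end{aligned} \tag{6.8}$$

(6.8) is, of course, an immediate consequence of (6.3). The writer has checked the accuracy of formulae (6.4)-(6.7) by reference to the normal universe for $c = 1$.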

The reader will have no illusions as to the magnitude of the task of applying the foregoing theory to particular cases. The formulae are set down, however, in the hope that other researchers will be sufficiently sensible of the importance of the theory to assist in building up a fairly extensive set of results. The writer has to be content, in the meantime, to consider the case of the symmetrical universe field given by

$$f(x) = \frac{1}{\sqrt{2\pi}}\Big\{1 + \frac{\lambda_4}{4!}\Big(\frac{d}{dx}\Big)^4\Big\}e^{-\frac12x^2}, \tag{6.9}$$

when $\lambda_4 = \frac12$, the normal being given, of course, for $\lambda_4 = 0$, and for $c = 4$ and $c = 1$. These values of $c$ are selected because the theory in §5 has suggested that $a(4)$ is probably the most efficient of the test-field $a(c)$, while $a(1)$ is the only member of the field for which the normal distribution is known for samples of all sizes. The necessary moments $(fg)$ given by (6.8) are shown in Table 8.

Table 8. Moments from formulae (6.8)

                      $c = 4$                               $c = 1$
  $(fg)$      Normal       $\lambda_4 = \frac12$     Normal       $\lambda_4 = \frac12$
  (11)        4            5.428571                  1            1.17021276
  (02)        2            2.5                       2            2.5
  (12)        24           45.64286                  3            4.88297871
  (03)        8            14                        8            14
  (04)        60           138                       60           138
  (13)        216          544.2857                  21           44.106383
  (21)        256/3        177.71428                 1.141593     1.75544898
  (22)        2,720/3      2,481.92857               7.707963     14.766814
  (20)        32/3         16.142857                 0.570796     0.63834981
  (30)        352          799.142857                0.429204     0.6405182
  (31)        4,352        12,785.2653               3            5.236134
  (40)        23,552       73,250.178                -            2.002492

Based on the values in this table, moments ($M'$) given by (6.4)-(6.7) of $a_1(c)$ and semi-invariants ($L$) derived therefrom are as follows. The normal values are, of course, known exactly but were computed for the purpose of checking the formulae:
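The entries of Table 8 follow mechanically from (6.3) once the absolute moments of the parent universe are known, and this can be sketched in code. The fragment is an illustration with hypothetical names; it assumes universal variance unity and takes the absolute moments of (6.9) to be the normal values multiplied by $1 + \lambda_4k(k-2)/24$, which is the form (5.4) assumes when only $\lambda_4$ is present.

```python
import math
from math import comb

def abs_moment(k, lambda4=0.0):
    """E|x|^k under (6.9): the normal value times 1 + lambda4 * k * (k - 2) / 24."""
    normal = 2 ** (k / 2) * math.gamma((k + 1) / 2) / math.sqrt(math.pi)
    return normal * (1 + lambda4 * k * (k - 2) / 24)

def fg_moment(f, g, c, lambda4=0.0):
    """(fg) = E y^f z^g with y = (|x|^c - mu_c) / mu_c and z = x^2 - 1 (mu_2 = 1),
    obtained by binomial expansion of both factors."""
    mu_c = abs_moment(c, lambda4)
    total = 0.0
    for i in range(f + 1):
        for j in range(g + 1):
            coeff = comb(f, i) * comb(g, j) * (-mu_c) ** (f - i) * (-1.0) ** (g - j)
            total += coeff * abs_moment(i * c + 2 * j, lambda4)
    return total / mu_c ** f

if __name__ == "__main__":
    for f, g in [(1, 1), (0, 2), (1, 2), (0, 3), (2, 0), (3, 0), (4, 0)]:
        print((f, g), fg_moment(f, g, 4), fg_moment(f, g, 4, 0.5))
```

With $\lambda_4 = \frac12$ and $k = 1$ the moment function gives 0.78126197, the value of $\mu_{[1]}$ used for the $c = 1$ alternative.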

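$c = 4$; normal universe

$$\begin{aligned}
\frac{L_1}{3} = \frac{M'_1}{3} &= 1 - \frac{2}{n} + \frac{4}{n^2} - \frac{8}{n^3},\\
\frac{M'_2}{9} &= 1 - \frac{4}{3n} - \frac{28}{n^2} + \frac{1040}{3n^3}, & \frac{L_2}{9} &= \frac{8}{3n} - \frac{40}{n^2} + \frac{1136}{3n^3},\\
\frac{M'_3}{27} &= 1 + \frac{2}{n} - \frac{48}{n^2} - \frac{1040}{n^3}, & \frac{L_3}{27} &= \frac{64}{n^2} - \frac{2368}{n^3},\\
\frac{M'_4}{81} &= 1 + \frac{8}{n} + \frac{40}{3n^2} - \frac{3520}{n^3}, & \frac{L_4}{81} &= \frac{3840}{n^3}.
\end{aligned}$$

$c = 4$; universal $\lambda_4 = \frac12$

$$\begin{aligned}
\frac{L_1}{3.5} = \frac{M'_1}{3.5} &= 1 - \frac{3.357}{n} + \frac{11.822}{n^2} + \frac{12.1}{n^3},\\
\frac{M'_2}{(3.5)^2} &= 1 - \frac{2.286}{n} - \frac{57.34}{n^2} + \frac{776.03}{n^3}, & \frac{L_2}{(3.5)^2} &= \frac{4.4286}{n} - \frac{92.25}{n^2} + \frac{831.2}{n^3},\\
\frac{M'_3}{(3.5)^3} &= 1 + \frac{3.215}{n} - \frac{107.47}{n^2} - \frac{2853.89}{n^3}, & \frac{L_3}{(3.5)^3} &= \frac{144.61}{n^2} - \frac{6193.95}{n^3},\\
\frac{M'_4}{(3.5)^4} &= 1 + \frac{13.143}{n} + \frac{20.49}{n^2} - \frac{9529}{n^3}, & \frac{L_4}{(3.5)^4} &= \frac{10{,}587}{n^3}.
\end{aligned}$$

$c = 1$; normal universe

$$\begin{aligned}
L_1 = M'_1 &= 0.7978845608 + \frac{0.19947114}{n} + \frac{0.02493389}{n^2} - \frac{0.03116737}{n^3},\\
L_2 &= \frac{0.04507034}{n} - \frac{0.07957747}{n^2} + \frac{0.03978874}{n^3},\\
L_3 &= -\frac{0.01685645}{n^2} + \dots
\end{aligned}$$

$c = 1$; universal $\lambda_4 = \frac12$

$$\begin{aligned}
\frac{L_1}{\mu_{[1]}} = \frac{M'_1}{\mu_{[1]}} &= 1 + \frac{0.35239362}{n} - \frac{0.159616}{n^2} - \frac{0.745838}{n^3},\\
\frac{M'_2}{\mu_{[1]}^2} &= 1 + \frac{0.79792429}{n} - \frac{0.458012}{n^2} - \frac{1.800648}{n^3}, & \frac{L_2}{\mu_{[1]}^2} &= \frac{0.09313705}{n} - \frac{0.262961}{n^2} - \frac{0.196477}{n^3},\\
\frac{M'_3}{\mu_{[1]}^3} &= 1 + \frac{1.336592}{n} - \frac{0.850081}{n^2} - \frac{3.239101}{n^3}, & \frac{L_3}{\mu_{[1]}^3} &= -\frac{0.053356}{n^2} + \frac{0.204164}{n^3},
\end{aligned}$$

$$\mu_{[1]} = 0.78126197.$$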
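The reliability of expansions carried to $n^{-3}$ can be gauged in the one case where an exact value is available. The sketch below compares the series for $L_2$ of $a_1(4)$ in the normal universe, as read above, with the exact expression of (4.11); the function names are hypothetical and nothing beyond these two formulae is assumed.

```python
def l2_exact(n):
    """Exact variance of a_1(4) for normal samples, from (4.11)."""
    return 24 * n**2 * (n - 1) / ((n + 2)**2 * (n + 4) * (n + 6))

def l2_series(n):
    """The same variance from the expansion L_2 / 9 = 8/(3n) - 40/n^2 + 1136/(3n^3)."""
    return 9 * (8 / (3 * n) - 40 / n**2 + 1136 / (3 * n**3))

def relative_error(n):
    return abs(l2_series(n) - l2_exact(n)) / l2_exact(n)

if __name__ == "__main__":
    for n in (24, 50, 100, 500):
        print(n, l2_exact(n), l2_series(n), relative_error(n))
```

The error is of the order of a tenth at $n = 24$ and about one part in a thousand at $n = 100$, which bears out the caution expressed earlier about samples much below 100 when $c$ exceeds 3.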


Two sample sizes were considered: $n = 100$ and $n = 500$. For $n = 100$ and $c = 4$, the following are the Pearson Type IV frequencies of $a_1(4)$ when the parent universes are normal and have $\lambda_4 = \beta_2 - 3 = \frac12$ respectively:
Normal: A, = 0. xk cos!?8350 @ ¢18-:015430 gy, 
tan 0 = (x—1-873387)/0-765849, (6-10) 
logigk = 3-2644596. 
kK COS%0096 @ ¢231280 dy 
tan 0 = (x—2-8522)/0-9062, (6-11) 
logyX = 1-7499974. 
The normal probability points shown in column (2) of Table 9 were derived from the foregoing normal frequency (6.10); the points in column (3) were derived from a Gram-Charlier formula (Geary, 1935). The 0.01 and 0.05 points given in column (2) are practically identical with those given by E. S. Pearson (1929) for $a(4)$, namely, 4.39 and 3.77. The powers given in column (4) are the aggregate frequencies lying beyond the values of the variate shown in column (2) on the assumption that the actual frequency was (6.11). The corresponding figures for $c = 1$ given in column (5) were based on a Gram-Charlier formula.
Table 9. Power of $a_1(c)$ for $c = 4$ and $c = 1$ of discriminating (6.9) for $\lambda_4 = \frac12$ from the normal ($\lambda_4 = 0$) at four normal theory probability levels. Samples of 100

                   Normal theory probability points     Power for frequency (6.9) with $\lambda_4 = \frac12$
  Normal theory    $c = 4$       $c = 1$                $c = 4$       $c = 1$
  probability      (upper)       (lower)
  (1)              (2)           (3)                    (4)           (5)
  0.01             4.3836        0.7482                 0.0648        0.0695
  0.05             3.7744        0.7642                 0.1995        0.1979
  0.10             3.5195        0.7725                 0.3163        0.3037
  0.20             3.3110        0.7824                 0.4525        0.4597























Before discussing the comparative powers in Table 9 it will be convenient to give a table, 10, on the same lines but for $n = 500$. On account of the larger sample size it has been necessary to change the reference-probabilities given in column (1). For the construction of this table Gram-Charlier formulae were used throughout, the probability points being determined from the E. A. Cornish & R. A. Fisher (1937) formulae, after verifying that for two of the probability levels, 0.01 and 0.05, the probability points for $c = 4$ (column (2) of Table 10) did not differ appreciably from those given by E. S. Pearson, namely, 3.60 and 3.37 (for $a(4)$), based on a Type IV curve.
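How a Cornish-Fisher type adjustment yields such probability points can be sketched for $a_1(4)$ with normal samples of 500. The fragment is an illustration under stated assumptions, not the computation actually used for the table: it applies only the familiar leading terms of the expansion (skewness, kurtosis and squared skewness), takes the semi-invariants from (4.11), and uses hypothetical function names.

```python
from statistics import NormalDist

def a1_4_semi_invariants(n):
    """L1..L4 of a_1(4) for normal samples, from (4.11)."""
    L1 = 3 * n / (n + 2)
    L2 = 24 * n**2 * (n - 1) / ((n + 2)**2 * (n + 4) * (n + 6))
    L3 = (1728 * n**3 * (n - 1) * (n - 2)
          / ((n + 2)**3 * (n + 4) * (n + 6) * (n + 8) * (n + 10)))
    poly = 30 * n**4 + 168 * n**3 - 608 * n**2 - 2672 * n + 3712
    L4 = (10368 * n**4 * (n - 1) * poly
          / ((n + 2)**4 * (n + 4)**2 * (n + 6)**2
             * (n + 8) * (n + 10) * (n + 12) * (n + 14)))
    return L1, L2, L3, L4

def cornish_fisher_point(L1, L2, L3, L4, upper_tail_prob):
    """Upper probability point from the leading Cornish-Fisher terms."""
    z = NormalDist().inv_cdf(1 - upper_tail_prob)
    g1 = L3 / L2**1.5                      # skewness ratio
    g2 = L4 / L2**2                        # kurtosis excess ratio
    w = (z + g1 * (z**2 - 1) / 6
           + g2 * (z**3 - 3 * z) / 24
           - g1**2 * (2 * z**3 - 5 * z) / 36)
    return L1 + L2**0.5 * w

if __name__ == "__main__":
    L = a1_4_semi_invariants(500)
    for prob in (0.005, 0.01, 0.05, 0.10):
        print(prob, round(cornish_fisher_point(*L, prob), 4))
```

Even with these few terms the 0.05 and 0.01 points come out within about 0.001 of the values 3.3766 and 3.6094 shown in column (2) of Table 10.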

The analysis in §5 has enabled us to come fairly firmly to the conclusion that for indefinitely large samples $a(4)$ was to be preferred to $a(1)$ as a test of normality. We see from Tables 9 and 10 that this is subject to an important qualification. Table 10 shows that the discriminating power is definitely greater for samples of 500 for $a(4)$ than for $a(1)$, but the superiority is less emphatic than might have been anticipated from §5. For medium-sized samples (Table 9) $a(4)$ exhibits no superiority. Of course, these conclusions are very tentative, as being based upon a single alternative and on particular sample sizes. The writer had proposed, in addition, to examine the universes (i) $\lambda_3 = 0$, $\lambda_4 = 1$ and (ii) $\lambda_3 = \lambda_4 = \frac12$ as alternatives to the normal but time did not permit; he ventures to repeat the hope that other students will take the matter up.










Table 10. Power of a,(c) for c = 4 and c = 1 of discriminating (6-9) for Ay = 4 
from the normal (A, = 0) at four probability levels. Samples of 500 











Normal           Normal probability points           Power for frequency (6.9)
probability      c = 4 (upper)    c = 1 (lower)      c = 4        c = 1
    (1)               (2)              (3)            (4)          (5)
    0.005           3.7062          0.773167         0.1934       0.2067
    0.01            3.6094          0.775684         0.2920       0.2790
    0.05            3.3766          0.782482         0.5955       0.5196
    0.10            3.2695          0.786058         0.7392       0.6509


























7. CONCLUSION AND SUMMARY 

In §2 of the present paper it is shown that the actual probability of differences between 
means and variances derived from random samples on the nul-hypothesis may differ
considerably from the probability derived from the standard tables (compiled on the 
assumption that the universal distribution is normal), when, in fact, the universal distribu- 
tion is not normal. Accordingly, the standard tables cannot validly be used unless tests, 
based on the sample from which the inferences are to be drawn, or on a series of samples 
produced under similar conditions, have established the likelihood that the universal 
distribution is approximately normal. In certain cases—but these must be few—the nature 
of the material may, of itself, suffice to justify the assumption of universal normality. 
When universal normality cannot be assumed, the best course will be to correct the standard 
tables using, for this purpose, the moments (up to, say, the fourth) derived from the sample, 
in conjunction with the formulae given in § 2. This procedure is, of course, open to the objection
that the moments derived from the sample may, in fact, differ substantially from the (in 
general unknown) universal moments, so that any probabilistic inference derived using 
sample moments must be accepted with reserve. If $b_2 = 3.5$, say, it would be safer to assume
that the universal value $\beta_2$ is 3.5, than to hope (without other evidence) that it is 3, the
normal value; it might be 3.75 or even 4, when, usually, the standard table probabilities
will be still further astray. It should not be difficult to construct supplementary tables
giving very approximate corrections of the standard tables, using the moment expansions
given in § 2, for different values of $\sqrt{\beta_1}$ and $\beta_2$. To compute unbiassed estimates of the latter,
R. A. Fisher’s k statistics (1929) should, of course, be used.

It may be asked if testing for normality and, when necessary, correction for universal 
non-normality is worth the trouble. To answer this question it is desirable to have regard to 
the logical position of the statistician, concerned with drawing inferences from samples, 
whose characteristic approach may be defined as reductio ad paene absurdum: if an event is 
highly improbable it must be regarded for practical purposes as impossible. St Thomas 
Aquinas’s* famous ‘certitude of probability’ is peculiarly apt as applied to the mental 
attitude of the statistician, from two quite different viewpoints. The first is that decision, 
and action based on that decision, for which there is not certainty, but merely probabilistic 
preference, is absolute. One does not say that one has a preference of 20 to 1 for Fertilizer A 

* ‘According to the Philosopher, certitude is not to be sought equally in every matter.... Hence
the certitude of probability suffices, such as may reach the truth in the greater number of cases, although
it fails in the minority’ (Summa IIa–IIae, q. lxx, a. 2).










over Fertilizer B because the difference between the yields is at or near the 5 % probability
point of some test function: one necessarily decides without qualification that A is better
than B. 

The second aspect, which has the greater relevance in the present case, is that the statis- 
tician regards himself as endowed with ‘certitude’ when he knows that if he repeated an 
experiment, as to, say, significant differences in averages, a great number of times, he would
be in error in attributing significant difference when, in fact, there was none, in a predeter- 
mined proportion of cases. He has certitude as to the probability though his decision in the 
individual case may be wrong. What is curious is that decisions (which, in effect, are absolute) 
can be based on probability levels which vary with the temperament of the statistician from 
perhaps a conservative 0-001 to a daring 0-1. For the particular statistician the probability 
level will vary with the case: for instance, the present writer would be inclined to suspect 
non-normality near the 10 % probability level of the a(1) table, whereas he would not be 
disposed to attach significance in, say, analysis of variance, until about the 2½ % level.
Naturally the level will depend on the importance attaching to the decision. 

Since all the statistician usually requires from the table of probability for a given measure 
of significance is whether, on the nul-hypothesis, the probability is ‘small’, absolute 
precision is not necessary in the probability. If the probability is thought to be minute, say 
0-001, it does not matter if in actual fact it is 0-002 or 0-0005. If, on the contrary, the standard 
table value is approaching the statistician’s level of decision it surely matters a great deal: 
if he thinks his judgment is likely to be erroneous in 1 out of 20 experiments it must be of 
importance if, in fact, the true probability is something like 1 in 10 or 1 in 5. These are the
kinds of contrasts that appear from §2, from comparison of standard table probabilities 
with ‘actual’ probabilities found when the samples were assumed to be randomly drawn 
from certain arbitrarily selected types of non-normal universes. The computed probabilities 
in §2 admittedly make no claim to exactitude in most of the cases, since the formulae were 
strained by their application to small sample theory. The point is, however, that the estimates 
of the actual probabilities are unbiassed in regard to the ‘normal theory’ probabilities: 
if the former could be closer to the latter, they might also be further away. 

There is one case which is in a quite exceptional category, namely that considered at the 
beginning of § 2. As far as the writer is aware, this case has never been examined theoretically 
before, despite the extreme simplicity of the algebra. It is shown that in the simplest case 
of analysis of variance, when the two sample numbers are of the same order of magnitude, 
the variance is proportional, approximately, to $(\beta_2 - 1)$, so that quite a small measure of
universal kurtosis materially changes the probability. Statisticians must have been affected
by a kind of hypnosis in favour of normal theory to have overlooked so trivial a point,
a stricture from which the writer is not particularly concerned to exclude himself! An
exception was E. S. Pearson (1931) who, on the basis of his results cited in § 2 (a), sounded
a warning: ‘The illustration should serve to emphasize the fact that certain of the “normal
theory” tests can be used with greater confidence than others when dealing with samples
from populations whose distribution laws are not known.’
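
The dependence on kurtosis can be illustrated with a related textbook result, which is not the writer's own § 2 formula: for n independent observations the unbiased sample variance s² has variance σ⁴[(β₂ − 1)/n + 2/(n(n − 1))]. The sketch below evaluates this expression and compares it with the normal case β₂ = 3. The function names are hypothetical and chosen only for this illustration.

```python
def variance_of_sample_variance(n, beta2, sigma2=1.0):
    """Exact variance of the unbiased sample variance s^2 for n i.i.d. draws.

    beta2 is the universal kurtosis mu_4 / sigma^4 (3 for the normal).
    Standard result: Var(s^2) = sigma^4 * [(beta2 - 1)/n + 2/(n(n - 1))].
    """
    if n < 2:
        raise ValueError("need at least two observations")
    return sigma2 ** 2 * ((beta2 - 1.0) / n + 2.0 / (n * (n - 1.0)))


def inflation_over_normal(n, beta2):
    """Ratio of Var(s^2) under kurtosis beta2 to its value under normality."""
    return (variance_of_sample_variance(n, beta2)
            / variance_of_sample_variance(n, 3.0))


if __name__ == "__main__":
    for beta2 in (3.0, 3.5, 4.0):
        print(beta2, inflation_over_normal(100, beta2))
```

For large n the ratio tends to (β₂ − 1)/2, so on this formula a universal β₂ of 4 instead of 3 raises the sampling variance of s² by roughly one half.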

An interesting chapter could be written on the fluctuations in the attitude of statisticians 
during the past century on the question of the occurrence of the normal frequency distribu- 
tion in nature, a chapter, perhaps, in a large work on Fashions in the Sciences down the Ages. 
Amongst the following the historian may find the reasons for the prejudice in favour of the 
hypothesis of universal normality up to, say, the end of the last century: 







(1) The fact that, to a close approximation, it applies in a wide range of mathematical
conditions.

(2) The fact that the theory found practical applications predominantly in assessing the
probability of errors in astronomical measurements and in games of chance where the
mathematical model could reasonably be assumed to apply.

(3) The beauty of the mathematical theory and the facility of algebraic manipulation in
the function involved.

(4) The general shape to the visual sense of such frequency distributions as were known,
before $\chi^2$ imposed its discipline.

With the development, about the beginning of the century, of the theory of moments, 
statisticians became almost over-conscious of universal non-normality. The concomitant 
semi-invariant approach had quite a different background. The difference between the 
moment and Karl Pearson curve system on the one hand and semi-invariants and the Gram- 
Charlier system on the other is fundamentally that for the former normality is a particular 
case like any other, whereas for the latter normality is basic and generative. Each system 
has its advantages and disadvantages as applied to the determination of frequency dis- 
tributions of which the lower moments are known. In fanciful terms one might say that in 
the ship Gram-Charlier one might sail in perfect safety but only within limited, and more
or less ascertainable, range of Port Normality, whereas in the good craft Pearson one can 
sail the seven seas—at one’s own risk.* 

Our historian will find a significant change of attitude about a quarter-century ago following
on the brilliant work of R. A. Fisher who showed that, when universal normality could be
assumed, inferences of the widest practical usefulness could be drawn from samples of any 
size. Prejudice in favour of normality returned in full force and interest in non-normality 
receded to the background (though one of the finest contributions to non-normal theory 
was made during the period by R. A. Fisher himself), and the importance of the underlying 
assumptions was almost forgotten. Even the few workers in the field (amongst them the 
present writer) seemed concerned to show that ‘universal non-normality doesn’t matter’:
we so wanted to find the theory as good as it was beautiful. References (when there were
any at all) in the text-books to the basic assumptions were perfunctory in the extreme.
Amends might be made in the interest of the new generation of students by printing in 
leaded type in future editions of existing text-books and in all new text-books: 

Normality is a myth; there never was, and never will be, a normal distribution. 
This is an over-statement from the practical point of view, but it represents a safer initial 
mental attitude than any in fashion during the past two decades. 

As already indicated, the present work is incomplete, especially on the experimental side. 
The writer hopes that he has created a prima facie case for the importance of testing for 
normality. 

SUMMARY


(i) Inferences drawn from the standard (normal) tables of z and t may be seriously in
error if the conditions in which the standard tables apply (the principal of which is that the 
universes from which the samples are drawn are normal) are ignored. 


* This comment must not be taken as applying to the problem of curve-fitting, i.e. to fitting a smooth 
curve to given frequencies, but to the problem of estimating the frequency function given the first 
few semi-invariants. 
















(ii) Sufficient conditions are given for the approach to normality, with increasing sample 
size, of the field of tests of normality a(c) (given by (3-1)) for c>0. 

(iii) Many term expansions of the first four moments of a(c) for normal samples are given 
with practical applications designed to find the values of c for which the moments could 
be used with confidence to find the frequency distributions for medium-size samples; semi- 


invariants of a,(2-4) and a,(4) (@,(c) is given by (3-2)) are compared; correlations between 
a,(c) and a,(c’) are examined. 

(iv) For indefinitely large samples and a wide field of alternative universes a(4) is found 
to be the most sensitive test of kurtosis and an analogous test of asymmetry g(c) is found to 
be most sensitive for c = 3, g(3) being the familiar $\sqrt{b_1}$.

(v) An examination of the relative efficiency of a(1) and a(4) from the Power Function 
point of view suggests that a(4) is increasingly to be preferred as the sample size increases; 
for samples of moderate size a(1) is probably as efficient as a(4). 

(vi) Throughout the paper a considerable range of formulae is given in case students may 
feel interested to carry the writer’s researches a stage further so as to give a firmer basis to 
his conclusions or to modify them. It is suggested (§ 4) that the preparation of a table of 
probability points of a(2-2) for normal samples of different sizes be taken in hand. 


REFERENCES 
Baker, G. A. (1932). Ann. Math. Statist. 3, 1.
Bartlett, M. S. (1935). Proc. Camb. Phil. Soc. 31, 226.
Cornish, E. A. & Fisher, R. A. (1937). Rev. Inst. Int. Statist. 5, 307.
Craig, C. C. (1928). Metron, 7, 3.
Eden, T. & Yates, F. (1933). J. Agric. Sci. 23, 6.
Fisher, R. A. (1925). Metron, 5, 90.
Fisher, R. A. (1929). Proc. Lond. Math. Soc. (2), 30, 199.
Fisher, R. A. (1930). Proc. Roy. Soc. A, 130, 16.
Fisher, R. A. & Wishart, J. (1931). Proc. Lond. Math. Soc. (2), 33, 195.
Fréchet, M. (1937). Généralités sur les Probabilités. Variables aléatoires.
Geary, R. C. (1933). Biometrika, 25, 184.
Geary, R. C. (1935). Biometrika, 27, 310, 353.
Geary, R. C. (1936). J. Roy. Statist. Soc. (Supplement), 3, 178.
Geary, R. C. (1936). Biometrika, 28, 295.
Geary, R. C. (1947). Biometrika, 34, 68.
Geary, R. C. & Pearson, E. S. (1938). Tests of Normality.
Geary, R. C. & Worlledge, J. P. G. (1946). Biometrika, 34, 98.
Gosset, W. S. (1908). Biometrika, 6, 1.
Hsu, C. T. & Lawley, D. N. (1940). Biometrika, 31, 238.
Kendall, M. G. (1941). Biometrika, 32, 81.
Nair, A. N. K. (1942). Sankhyā, 5, 393.
Neyman, J. & Pearson, E. S. (1933). Philos. Trans. A, 231, 289.
Neyman, J. & Pearson, E. S. (1936). Statist. Res. Mem. 1, 1.
Pearson, E. S. (1929). Biometrika, 21, 337.
Pearson, E. S. (1930). Biometrika, 22, 239.
Pearson, E. S. (1931a). Biometrika, 22, 423.
Pearson, E. S. (1931b). Biometrika, 23, 114.
Pearson, E. S. (1935). Biometrika, 27, 333.
Pearson, E. S. & Adyanthaya, N. K. (1929). Biometrika, 21, 259.
Pearson, Karl (1895). Philos. Trans. A, 186, 343.
Pepper, Joseph (1932). Biometrika, 24, 55.
Rider, P. R. (1931). Ann. Math. Statist. 2, 48.
Rietz, H. L. (1939). Ann. Math. Statist. 10, 265.
Shewhart, W. A. & Winters, F. W. (1928). J. Amer. Statist. Ass. 23, 144.
Wishart, J. (1930). Biometrika, 22, 224.
Yasukawa, K. (1934). Tôhoku Math. J. 38, 465.







THE STRATIFIED SEMI-STATIONARY POPULATION 
By S. VAJDA 


1. CONSTANT POPULATION 


Let a set of non-increasing real values $p_0 = 1, p_1, \ldots, p_n, p_{n+1} = 0$ be given, and let $p_i$
represent the probability of a person of age 0 surviving the $i$ following years. Further, let
$l_0, l_1, \ldots, l_n$ represent the numbers of persons of age $0, 1, \ldots, n$ living at time $t = 0$. We consider
then the development of such a population during the years following $t = 0$, under the
assumption that the probabilities $p_i$ remain the same throughout the period investigated.

Only persons of age 0 are to enter the population, and the number of such entrants shall
be such that the total of the population is kept constant at a number $H = \sum_{i=0}^{n} l_i$. At the end
of the first year the survivors of the $H$ persons who were alive at $t = 0$ will be (if we put
$l_i/p_i = r_i$, say)

$$\sum_{i=0}^{n-1} l_i \frac{p_{i+1}}{p_i} = \sum_{i=0}^{n-1} r_i p_{i+1} < H,$$

and therefore the number of entrants at the beginning of the second year (i.e. at $t = 1$) is

$$\phi_1 = H - \sum_{i=0}^{n-1} r_i p_{i+1}.$$


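By the same argument the entrants at $t = 2$ will be

$$\phi_2 = H - \phi_1 p_1 - \sum_{i=0}^{n-2} r_i p_{i+2},$$

and so on; generally

$$\phi_t = H - \phi_{t-1} p_1 - \phi_{t-2} p_2 - \ldots - \phi_1 p_{t-1} - \sum_{i=0}^{n-t} r_i p_{i+t} \tag{1}$$

as long as $t \le n$, that is, as long as there are survivors of the initial population. For $t > n$
we obtain

$$H = \phi_t + \phi_{t-1} p_1 + \phi_{t-2} p_2 + \ldots + \phi_{t-n} p_n. \tag{2}$$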


We want to find an expression for $\phi_t$, which must obviously depend on $l_0, l_1, \ldots, l_n$. Now
(2) is a difference equation for the function $\phi_t$ of $t$ and can easily be solved. For this purpose
consider the ‘characteristic equation’

$$x^n + x^{n-1} p_1 + x^{n-2} p_2 + \ldots + x p_{n-1} + p_n = 0. \tag{3}$$


Let this equation have the roots $x_1, x_2, \ldots, x_s$, where $x_i$ is a $k_i$-fold root and $x_i \neq x_j$. We have
then as a solution of the difference equation (2)

$$\phi_t = H_1 + P_1(t)\,x_1^t + \ldots + P_s(t)\,x_s^t, \tag{4}$$

where $H_1 = H/\sum p_i$ and $P_i(t) = \alpha_{i1} + \alpha_{i2} t + \ldots + \alpha_{ik_i} t^{k_i - 1}$. The $\alpha_{ij}$ must be found from the
initial population, i.e. from equations of the form (1) which contain the first $n$ numbers of
entrants $\phi_1, \ldots, \phi_n$. But we find by inspecting these equations, which are of the form ($t \le n$)

$$H = \phi_t + \phi_{t-1} p_1 + \ldots + \phi_1 p_{t-1} + r_0 p_t + r_1 p_{t+1} + \ldots + r_{n-t} p_n$$








that they are equivalent to

$$r_t = \phi_{-t} = H_1 + \sum_{i} \sum_{j=1}^{k_i} \alpha_{ij}\,(-t)^{j-1}\,x_i^{-t}. \tag{5}$$
i=1j=1 


Hence the $\alpha_{ij}$ can be fixed, dependent on the $r_i = l_i/p_i$ and thus on the initial population.
We have thus proved:

If a population with an age distribution $l_0, l_1, \ldots, l_n$ is subject to survival rates $p_i$ ($i = 1, 2, \ldots, n$),
and if this population is kept constant by $\phi_t$ entrants of age 0 at the end of the $t$th year, then $\phi_t$
is given by (4), where the $x_i$ are the different roots of (3), and the $\alpha_{ij}$ must be found from the set (5).


The population after $t$ years will have the following age distribution:

$$\phi_t,\ \phi_{t-1} p_1,\ \phi_{t-2} p_2,\ \ldots,\ \phi_{t-n} p_n.$$

It can easily be proved that, if the $p_i$ are decreasing (and not merely non-increasing), then
for all the roots $x_i$ of equation (3) we have $|x_i| < 1$ and that any real root must be negative.
Hence the $\phi_t$ will oscillate around their limit $H_1$. The age distribution of the population
thus tends, again through oscillations, to $H_1, H_1 p_1, \ldots, H_1 p_n$ which may be called the
intrinsic stationary population. Obviously, if the initial population has already this distribution,
it will not alter any more and the number of entrants will be constant and $= H_1$.
In such a case all $\alpha_{ij} = 0$, and $r_i = H_1 = l_0$, whatever the $x_i$ may be.

On the other hand, if $p_i = p_{i+1}$ holds for one or more values of $i$, then we may get cycles,
and this is easily seen for the equation $x^n + x^{n-1} + \ldots + x + 1 = 0$. All roots have modulus 1,
and it depends on the initial population whether we are dealing with the stationary case or 
with periodic cycles. No tendency towards an intrinsic stationary population appears in 
such a case. 


Example. Let us assume that we have the following probabilities of survival:

    p_1      p_2      p_3      p_4       p_5      p_6
    7/8     49/96     5/32    13/384    1/384      0

The characteristic equation can then be written

$$384x^5 + 336x^4 + 196x^3 + 60x^2 + 13x + 1 = 0,$$

which has the five different roots

$$-\tfrac{1}{4} \pm \sqrt{-\tfrac{5}{48}}, \qquad -\tfrac{1}{8} \pm \sqrt{-\tfrac{7}{64}} \qquad \text{and} \qquad -\tfrac{1}{8}.$$

The initial population will be assumed to be

    l_0      l_1      l_2      l_3      l_4      l_5
    859     1269      229       50      115       56

which implies $H_1 = 1000$ (approx.). Therefore the $r_i = l_i/p_i$ are

    r_0      r_1        r_2       r_3       r_4        r_5
    859    1450.38    448.86    319.14    3405.42    21420.99

From $r_k = 1000 + \alpha_1 x_1^{-k} + \alpha_2 x_2^{-k} + \ldots + \alpha_5 x_5^{-k}$ ($k = 1, 2, \ldots, 5$) we find

    α_1         α_2        α_3    α_4    α_5
    −70+60i    −70−60i      0      0     −1

It follows that the number of entrants in the year $t$ (= 0, 1, 2, ...) will be

$$\phi_t = 1000\left[1 + (-0.07 + 0.06i)\left(-\tfrac{1}{4} + \sqrt{-\tfrac{5}{48}}\right)^{t} + (-0.07 - 0.06i)\left(-\tfrac{1}{4} - \sqrt{-\tfrac{5}{48}}\right)^{t} - 0.001\left(-\tfrac{1}{8}\right)^{t}\right].$$
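
The closed form for the entrants can be checked numerically. The sketch below is a minimal check, assuming the roots −1/4 ± √(−5/48), −1/8 ± √(−7/64) and −1/8 of the characteristic equation and the coefficients α₁, α₂ = −70 ± 60i, α₅ = −1 of this example. The names `char_poly` and `entrants` are hypothetical. A negative argument gives r_k = φ₋ₖ, as in (5).

```python
# Roots of 384x^5 + 336x^4 + 196x^3 + 60x^2 + 13x + 1 = 0 used in the example.
ROOT_PAIR = complex(-0.25, (5 / 48) ** 0.5)    # -1/4 + sqrt(-5/48)
ROOT_ZERO = complex(-0.125, (7 / 64) ** 0.5)   # -1/8 + sqrt(-7/64), alpha = 0
ROOT_REAL = -0.125                             # -1/8
ALPHA = complex(-0.07, 0.06)                   # alpha_1 / 1000


def char_poly(x):
    """Evaluate the characteristic polynomial of the example at x."""
    return 384 * x**5 + 336 * x**4 + 196 * x**3 + 60 * x**2 + 13 * x + 1


def entrants(t):
    """phi_t from the closed form; a negative t gives r_{-t} = phi_t."""
    # The two conjugate terms add up to twice the real part of one of them.
    pair = ALPHA * ROOT_PAIR**t
    return 1000 * (1 + 2 * pair.real - 0.001 * ROOT_REAL**t)


if __name__ == "__main__":
    print([round(entrants(t), 2) for t in range(9)])
    print([round(entrants(-k), 2) for k in range(1, 6)])
```

The values for t = 0, 1, 2, ... agree with the first row of Table 1 to within a unit; the printed table appears to adjust some roundings so that each column adds to the constant total.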



















These numbers are given in the first row of Table 1, which shows the evolution of the whole 
population. 














Table 1

   t        0     1     2     3     4     5     6     7    8 and after
Age 0     859   996  1025   989  1001  1001   999  1000   1000
    1    1269   752   872   897   865   876   876   874    875  (= 1000 x 7/8)
    2     229   740   438   508   523   505   511   511    510  (= 1000 x 49/96)
    3      50    70   227   134   156   160   155   156    156  (= 1000 x 5/32)
    4     115    11    15    49    29    34    34    34     34  (= 1000 x 13/384)
    5      56     9     1     1     4     2     3     3      3  (= 1000 x 1/384)
Total    2578  2578  2578  2578  2578  2578  2578  2578   2578
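
The development in Table 1 can also be followed by direct projection, without the roots. The sketch below is a minimal illustration, assuming the survival probabilities of the example and the rounded initial numbers printed in the table. The function name `project` is hypothetical. Because it starts from rounded numbers, individual entries may differ from the printed ones by a unit.

```python
from fractions import Fraction

# Survival probabilities p_0 .. p_5 of the example (p_6 = 0 is implicit).
P = [Fraction(1), Fraction(7, 8), Fraction(49, 96), Fraction(5, 32),
     Fraction(13, 384), Fraction(1, 384)]

# Rounded initial numbers at ages 0 .. 5, as printed in Table 1.
INITIAL = [859, 1269, 229, 50, 115, 56]


def project(initial, p, years):
    """Return the age distributions for t = 0 .. years with a constant total."""
    total = sum(initial)
    rows = [[Fraction(v) for v in initial]]
    for _ in range(years):
        prev = rows[-1]
        # A person aged i reaches age i + 1 with probability p[i + 1] / p[i];
        # those at the highest age leave the population.
        survivors = [prev[i] * p[i + 1] / p[i] for i in range(len(p) - 1)]
        entrants = total - sum(survivors)   # phi_t, all of age 0
        rows.append([entrants] + survivors)
    return rows


if __name__ == "__main__":
    for t, row in enumerate(project(INITIAL, P, 8)):
        print(t, [round(float(v)) for v in row])
```

Exact fractions are used so that the constancy of the total can be verified without rounding error.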



































2. TWO CONSTANT POPULATIONS 


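All this covers well known ground.* A new problem arises, however, when we consider two
initial populations with two sets of probabilities of survival, say $p_i$ ($i = 1, 2, \ldots, n_1$) and
$\bar{p}_i$ ($i = 1, 2, \ldots, n_2$), where $p_0 = \bar{p}_0 = 1$ and $p_{n_1} \bar{p}_{n_2} \neq 0$. We ask now whether it is possible to
keep both constant by the same number of yearly entrants. More precisely:

Let the two equations

$$\sum_{i=0}^{n_1} p_i x^{n_1 - i} = 0 \qquad \text{and} \qquad \sum_{i=0}^{n_2} \bar{p}_i y^{n_2 - i} = 0$$

have the roots $x_1, \ldots, x_s$ with multiplicities $k_1, \ldots, k_s$ and $y_1, \ldots, y_u$ with multiplicities $j_1, \ldots, j_u$
respectively. No two $x_i$ or two $y_i$ are equal and no $x_i$ or $y_i$ is zero. Under what further conditions,
concerning the $x$'s and the $y$'s, can the expressions $\phi_t$ and $\psi_t$ then have the same
numerical values for all integral values of $t$, i.e.

$$(H_1 - H_2) + \sum_{i=1}^{s} P_i(t)\,x_i^t - \sum_{i=1}^{u} \bar{P}_i(t)\,y_i^t = 0 \quad \text{for } t = 0, 1, 2, \ldots, \tag{6}$$

where

$$\phi_t = H_1 + \sum_{i=1}^{s} P_i(t)\,x_i^t \quad \text{with} \quad P_i(t) = \alpha_{i1} + \alpha_{i2} t + \ldots + \alpha_{ik_i} t^{k_i - 1} \tag{7}$$

and

$$\psi_t = H_2 + \sum_{i=1}^{u} \bar{P}_i(t)\,y_i^t \quad \text{with} \quad \bar{P}_i(t) = \beta_{i1} + \beta_{i2} t + \ldots + \beta_{ij_i} t^{j_i - 1}. \tag{8}$$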


Suppose first that none of the $x$'s equals any of the $y$'s. Then it is known that the determinant
of any set of equations of the system (6) is not zero. It follows that we must have
$H_1 = H_2$ and all $\alpha$'s and $\beta$'s $= 0$, hence all $P_i(t)$ and $\bar{P}_i(t) \equiv 0$. In this case the two populations
must already be stationary and therefore identical with the intrinsic stationary populations
which are implied by the sets $p_i$ and $\bar{p}_i$, respectively.

On the other hand, if some of the $x$'s are equal to some of the $y$'s, say $x_1 = y_1, \ldots, x_m = y_m$
and all the others are different, then we find by the same argument that $H_1 = H_2$ and $P_i = \bar{P}_i$
for the first $m$ values of $i$, whereas all the other $P_i$ and $\bar{P}_i$ are identically zero. (It is, of course,
again possible that all the $P_i$ and $\bar{P}_i$ are identically zero and that we have, in fact, again the
two intrinsic stationary populations.)

If all x’s are equal to the y’s, with equal multiplicities, then the two equations are equal 


* It follows, for example, from results of P. H. Leslie (1945). 










and the two populations must be identical, if they are to be kept constant by equal numbers 
of entrants. 

We have thus reached the following conclusion: If we assume that the two equations given
above are not identical, and that the initial populations are not the intrinsic stationary ones,
then they must be such that the two equations have some (but not all) roots equal and if we
calculate the corresponding $P_i$ and $\bar{P}_i$ (see (4) and (5) of the previous section), then those
corresponding to the equal roots must be identical and the others must vanish. This includes
the case where the $x$'s and $y$'s are the same, but with different multiplicities, so that the
$P_i$ and $\bar{P}_i$ do not all extend to the highest power of $t$ which would be admissible by (7) or (8)
respectively.

If the two populations are the intrinsic stationary ones, then the numbers of entrants will
be constant (i.e. independent of the year) and the two constants will be equal if and only if

$$\frac{\sum l_i}{\sum p_i} = \frac{\sum \bar{l}_i}{\sum \bar{p}_i},$$

where the first expression refers to the first and the second expression to the second
population.


Example. In the example used in §1 we have $\alpha_3 = \alpha_4 = 0$, and we can therefore try to
obtain a second population which is kept constant by the same numbers of entrants as the
first one. We construct an equation which has again the roots $-\tfrac{1}{4} \pm \sqrt{-\tfrac{5}{48}}$ and $-\tfrac{1}{8}$, but not
$-\tfrac{1}{8} \pm \sqrt{-\tfrac{7}{64}}$. Such an equation is, for example,

$$480x^5 + 396x^4 + 218x^3 + 62x^2 + 13x + 1 = 0,$$

which has the roots $-\tfrac{1}{4} \pm \sqrt{-\tfrac{5}{48}}$, $-\tfrac{1}{8}$, and $-\tfrac{1}{10} \pm \sqrt{-\tfrac{9}{100}}$. It implies that the probabilities
of survival, i.e. $\bar{p}_i$, are

    33/40    109/240    31/240    13/480    1/480,

and as the $\phi_t$ (and the $r_t = \phi_{-t}$) are to be the same as in §1, the initial population must now be
$\bar{l}_i = r_i \bar{p}_i$. Table 2 shows the development of such a population, and it will be seen that the
first line is identical with that in Table 1.
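
That the two characteristic equations share exactly the roots stated can be confirmed by multiplying out integer factors. The sketch below is a minimal check, assuming the factorisations (6x² + 3x + 1)(8x + 1)(8x² + 2x + 1) and (6x² + 3x + 1)(8x + 1)(10x² + 2x + 1), which correspond to the roots quoted in the two examples. The helper `poly_mul` is hypothetical.

```python
def poly_mul(a, b):
    """Multiply two polynomials given as coefficient lists, highest power first."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


SHARED_QUADRATIC = [6, 3, 1]   # roots -1/4 +/- sqrt(-5/48)
SHARED_LINEAR = [8, 1]         # root -1/8
FIRST_ONLY = [8, 2, 1]         # roots -1/8 +/- sqrt(-7/64), first equation only
SECOND_ONLY = [10, 2, 1]       # roots -1/10 +/- sqrt(-9/100), second equation only

shared = poly_mul(SHARED_QUADRATIC, SHARED_LINEAR)
first_equation = poly_mul(shared, FIRST_ONLY)
second_equation = poly_mul(shared, SECOND_ONLY)

if __name__ == "__main__":
    print(first_equation)
    print(second_equation)
```

Dividing each coefficient list by its leading term gives the survival probabilities of the respective population.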


Table 2

   t        0     1     2     3     4     5     6     7    8 and after
Age 0     859   996  1025   989  1001  1001   999  1000   1000
    1    1196   709   822   845   816   826   826   824    825  (= 1000 x 33/40)
    2     204   658   390   452   465   449   455   455    454  (= 1000 x 109/240)
    3      41    58   187   111   129   132   128   129    129  (= 1000 x 31/240)
    4      92     9    12    39    23    27    27    27     27  (= 1000 x 13/480)
    5      45     7     1     1     3     2     2     2      2  (= 1000 x 1/480)
Total    2437  2437  2437  2437  2437  2437  2437  2437   2437

















































3. STRATIFIED POPULATION: TWO GRADES


The results of the previous sections will now be used for an investigation of the stratified 
population.* First, we consider a population split into a lower and a higher grade in the
following way: 

We assume that all members of age 0 are in the lower grade only, but that all other ages
may share in both grades. Apart from mortality, which operates on all members according 
to their age, we assume that at every age a certain proportion dependent on that age is 
‘promoted’, at the end of the year, from the lower into the higher grade. Our problem is 
to discover whether this can be done whilst maintaining the totals in both grades constant; 
naturally the grand total of the population must remain constant. 

It is sufficient to deal only with the lower grade, as the numbers at each age in the higher 
one can be found by subtracting those in the lower grade from the total population at that 
age. Now the lower grade is depleted by mortality and also by promotions. If the probability
of remaining unpromoted until age $i$ is $t_i$, then the probability of not leaving the grade in
this period is $p_i t_i = \bar{p}_i$, say. Since all entrants into the population are at the same time
entrants into the lower grade, our problem thus reduces to the following:

Is it possible to find an initial population, stratified into two grades, such that, on the
basis of mortality described by $p_i$, the number of entrants every year necessary to keep the
population constant is the same as that calculated on the basis of mortality-cum-promotion,
described by $\bar{p}_i$?

We can apply our results in § 2 to this case by considering the lower grade and the total
population as the two populations given. It follows that the lower grade can only be kept
constant by that number of entrants which is necessary for the total population, if the latter
is initially such that some of the $P_i(t)$ which depend on it are either identically zero or at
least do not extend to the highest degree indicated by the multiplicities of the corresponding
roots in $\sum p_i x^{n-i} = 0$. In order to find a suitable initial population for the lower grade it is
then necessary to find an equation $\sum \bar{p}_i y^{n-i} = 0$ which has the roots, with the necessary
multiplicities, which appear explicitly in $\phi_t$ as calculated from the original equation, but
which is not identical with it. The degree of $\sum \bar{p}_i y^{n-i} = 0$ may be lower than or equal to that
of $\sum p_i x^{n-i} = 0$. If it is lower, then all members of the population will be in the higher grade
at the highest age or ages.

This condition is not sufficient, however. In view of the interpretation of the equation
containing the $\bar{p}_i$'s these coefficients must be positive and, as the lower grade is a part of the
whole, we must have $\bar{p}_i \le p_i$ for all $i$. But it is not necessary that we have also $\bar{p}_{i+1} \le \bar{p}_i$.
If the opposite holds, this could still bear a practical interpretation. It would mean that
reversions occur from the higher into the lower grade.

If an equation with the necessary and sufficient properties can be found, then we take the
$r_i = l_i/p_i$ which we had to start with and construct the initial population of the lower grade
by writing the number at age $i$ as $\bar{l}_i = r_i \bar{p}_i = l_i \bar{p}_i / p_i$.

It will be seen that in such a population the age distributions change with the passage of 
time (tending to a stationary limit) but that nevertheless all entrants have the same com- 
bined prospects of survival and promotion. (Thus from the point of view of a member of 
the community his position is the same as if he entered a stationary population. His chances


* Cf., for the stationary case, with continuous changes, H. L. Seal (1945). 













of promotion are unaffected by the changes in the age distribution of those in front of him. 
But the characteristics of the population as a whole, for instance the efficiency of the staff 
from the point of view of an employer may, of course, vary considerably.) Such a population 
will be called semi-stationary. 
Example. The population shown in Table 2 can be taken as representing a lower grade
within the population given in Table 1. The ratios $t_i = \bar{p}_i/p_i$ are then:

    t_0     t_1       t_2        t_3      t_4     t_5
     1     33/35    218/245    62/75     4/5     4/5

Table 3 is constructed by subtracting Table 2 from Table 1 and thus shows the composition
of the higher grade.











Table 3

   t        0     1     2     3     4     5     6     7    8 and after
Age 1      73    43    50    52    49    50    50    50     50
    2      25    82    48    56    58    56    56    56     56
    3       9    12    40    23    27    28    27    27     27
    4      23     2     3    10     6     7     7     7      7
    5      11     2     -     -     1     -     1     1      1
Total     141   141   141   141   141   141   141   141    141
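
The ratios t_i and the limiting column of Table 3 follow directly from the two sets of survival probabilities. The sketch below is a minimal illustration using exact fractions. The names `P_TOTAL`, `P_LOWER` and the two functions are hypothetical, and the limiting number of entrants is taken as 1000, as in the example.

```python
from fractions import Fraction as F

# Survival probabilities of the whole population and of the lower grade.
P_TOTAL = [F(1), F(7, 8), F(49, 96), F(5, 32), F(13, 384), F(1, 384)]
P_LOWER = [F(1), F(33, 40), F(109, 240), F(31, 240), F(13, 480), F(1, 480)]


def unpromoted_ratios(p_total, p_lower):
    """t_i: probability of still being unpromoted at age i, given survival."""
    return [low / tot for tot, low in zip(p_total, p_lower)]


def stationary_higher_grade(p_total, p_lower, entrants=1000):
    """Limiting numbers in the higher grade at each age, before rounding."""
    return [entrants * (tot - low) for tot, low in zip(p_total, p_lower)]


if __name__ == "__main__":
    print(unpromoted_ratios(P_TOTAL, P_LOWER))
    print([round(float(v)) for v in stationary_higher_grade(P_TOTAL, P_LOWER)])
```

The second list reproduces the last column of Table 3, with a zero at age 0 because all entrants start in the lower grade.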






































4. STRATIFIED POPULATION: MORE THAN TWO GRADES 


Let us now split up the higher grade as well. We have then, say, $k$ grades, with grades 2 and
above forming the aggregate which was simply called the higher grade in §3; grade 1 is
identical with the lower grade of that section.

We assume further that promotions from any grade into the next higher one take place 
at the end of every year and that every promotee into any grade has to stay there for at least
one year. Thus in any population the lowest possible age of grade g is g— 1. The actual lowest 
ages may be different, because the first promotion rates different from 0 may concern higher 
ages than these. The rates of promotion can be different from grade to grade, but depend 
within each grade only on the age, as before. 

We shall again investigate whether it is possible to keep the total numbers of every grade 
constant, even if the age distributions of the grades are changing. 

We have seen that the age distribution of the total population, after $t$ years, is:

$$\phi_t,\ \phi_{t-1} p_1,\ \ldots,\ \phi_{t-n} p_n.$$

The distribution of grade 1 is, at the same time,

$$\phi_t,\ \phi_{t-1} \bar{p}_1,\ \ldots,\ \phi_{t-n} \bar{p}_n$$

and it is assumed that the set of $\bar{p}_i$ is not identical with the set of $p_i$. Hence grades 2 and above
will have the age distribution

$$\phi_t (p_0 - \bar{p}_0),\ \phi_{t-1}(p_1 - \bar{p}_1),\ \ldots,\ \phi_{t-n}(p_n - \bar{p}_n).$$


Let us assume that $\phi_{t-v}(p_v - \bar{p}_v)$ is the first item in this series which is not zero. Clearly we
have $v \ge 1$. Then, as far as numbers of members (and not their individual careers) are concerned,
this aggregate of grades 2 and above is equivalent to a population which has arisen
from successive annual entrants $\phi_{t-v}(p_v - \bar{p}_v)$ who have been subject to rates of survival

$$q_1 = \frac{p_{v+1} - \bar{p}_{v+1}}{p_v - \bar{p}_v}, \quad \ldots, \quad q_{n-v} = \frac{p_n - \bar{p}_n}{p_v - \bar{p}_v}.$$

It must be understood, however, that ‘survival’ is here a balance between deaths and
promotions into the grade, so that these rates may very well exceed unity.

The number of annual entrants into grade 2 is given by

$$\phi_{t-v}(p_v - \bar{p}_v) = \left[ H_1 + \sum_{i=1}^{m} P_i(t - v)\,x_i^{t-v} \right] (p_v - \bar{p}_v),$$

where $x_1, \ldots, x_m$ are the common roots of $\sum_{i=0}^{n} p_i x^{n-i} = 0$ and $\sum_{i=0}^{n} \bar{p}_i y^{n-i} = 0$, with multiplicities
$k_i$ and $j_i$ respectively, and where the $P_i$ are polynomials whose order does not exceed either
$k_i - 1$ or $j_i - 1$. (They may all be identically zero.)

The $x_i$ are, of course, also roots of $\sum_{i=v}^{n} (p_i - \bar{p}_i) x^{n-i} = 0$, with multiplicities given by the
smaller of $k_i$ and $j_i$.

We ask now if it is possible to construct grade 2 alone in such a way that its total remains
also constant. The argument which has been used in §3 shows that this is possible if another
equation of degree $n - v$ can be found whose coefficients $w_i$, say, are not larger than the corresponding
$q_i$ (and $w_0 = 1$), which has once again the roots $x_1, \ldots, x_m$, with multiplicities $g_i$ at
least. If $g_1 + \ldots + g_m = n - v$, then this is clearly impossible. If $g_1 + \ldots + g_m$ is smaller than
this value, then we can try to find such an equation. The initial population can also be then
found, if we multiply the initial population of grades 2 and above by $w_i/q_i$. The grades 1, 2
and the aggregate of 3 and above can then be constructed and every stratum kept constant,
but with changing age distributions.

We can proceed in the same way and find at each step whether further splitting up is
possible beyond 3 grades, 4 grades, etc. It is seen that in general, if $\sum g_i = n-m$, and if
grade $g$ starts in fact at age $g-1$, then $m+1$ grades can exist.

The smallest value of $\sum g_i$ is 1, and in this extreme case $n$ grades can be constructed, i.e.
one less than the number of ages. The $n$th grade will then contain the ages $n-1$ and $n$.
Further, since $x_1$ is a root of $x - x_1 = 0$, the age distribution of this highest grade is

H,+a", —(A,+o,2%) 2, = —Ayx,—a,2" 
(x, is, of course, negative). 


Example. We use again the same example as before. The characteristic equation for the 
whole population was 


a5 + fart + $805 + S22 + d30+ she = 0, 
and that for grade 1 alone 
a5 + $3a4 + £0873 4 Sta? + dso + chy = 0. 
The difference between these two equations gives the equation for grade 2 and above

    $x^4 + \tfrac{9}{8}x^3 + \tfrac{13}{24}x^2 + \tfrac{13}{96}x + \tfrac{1}{96} = 0.$

This equation has, of course, the roots $-\tfrac{1}{4} \pm \sqrt{-\tfrac{5}{48}}$, $-\tfrac{1}{8}$ which are common to the two
characteristic equations of the fifth degree, and also a further root $-\tfrac{1}{2}$. Now there is
a biquadratic equation with the three specified common roots and not larger coefficients
(and having the coefficient of $x^4$ equal to unity), viz.

    $x^4 + \tfrac{33}{40}x^3 + \tfrac{17}{48}x^2 + \tfrac{1}{15}x + \tfrac{1}{240} = 0.$

The fourth, irrelevant, root is $-1/5$. This equation leads to the following development:




































































Grade 2 only

  t        0     1     2     3     4     5     6     7     8 and after
  Age 1   73    43    50    52    49    50    50    50    50
      2   18    60    35    41    43    40    41    41    41
      3    6     8    26    14    18    19    18    18    18
      4   11     -     1     5     2     3     3     3     3
      5    4     1     -     -     -     -     -     -     -
  Total  112   112   112   112   112   112   112   112   112

Grade 3 and above

  t        0     1     2     3     4     5     6     7     8 and after
  Age 2    7    22    13    15    15    16    15    15    15
      3    3     4    14     9     9     9     9     9     9
      4   12     2     2     5     4     4     4     4     4
      5    7     1     -     -     1     -     1     1     1
  Total   29    29    29    29    29    29    29    29    29



































Analysis into further grades is impossible in this case, because the characteristic equation 


of the third grade does not have any roots apart from the three common roots of all previous 
equations. 


5. PROMOTION RATES DEPENDENT ON SENIORITY 
We still consider more than two grades, but now we will assume that the promotion rates do 
not depend on the attained age but on the seniority, i.e. on the time spent in the grade, 
instead. In the lowest grade seniority is equivalent to age, because all members were sup- 
posed to enter at the lowest age only. If we consider again the two grades of § 3, but this time 
take note of differences in seniority, we find the following pattern: 














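Age | Lowest grade | Higher grade, by seniority: 0 | 1 | ... | x-1 | Total
0 | $\phi_t p_0$ | | | | | $\phi_t p_0$
1 | $\phi_{t-1}p_1t_1$ | $\phi_{t-1}p_1(t_0-t_1)$ | | | | $\phi_{t-1}p_1$
2 | $\phi_{t-2}p_2t_2$ | $\phi_{t-2}p_2(t_1-t_2)$ | $\phi_{t-2}p_2(t_0-t_1)$ | | | $\phi_{t-2}p_2$
x | $\phi_{t-x}p_xt_x$ | $\phi_{t-x}p_x(t_{x-1}-t_x)$ | $\phi_{t-x}p_x(t_{x-2}-t_{x-1})$ | ... | $\phi_{t-x}p_x(t_0-t_1)$ | $\phi_{t-x}p_x$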









































Note. $t_i = \bar{p}_i/p_i$ and hence $t_0 = 1$.













If we consider now promotion from grade 2 into grade 3, and if we introduce $u_s$, the
probability of not being promoted during $s$ years from grade 2 ($u_0 = 1$), we see that grades 2
and 3 (including higher grades, if any) will have the following constitution:
































Grade 2

Age | Seniority 0 | 1 | ... | x-1
1 | $\phi_{t-1}p_1(t_0-t_1)u_0$
2 | $\phi_{t-2}p_2(t_1-t_2)u_0$ | $\phi_{t-2}p_2(t_0-t_1)u_1$
x | $\phi_{t-x}p_x(t_{x-1}-t_x)u_0$ | $\phi_{t-x}p_x(t_{x-2}-t_{x-1})u_1$ | ... | $\phi_{t-x}p_x(t_0-t_1)u_{x-1}$

Grade 3

Age | Seniority 0 | 1 | ... | x-2
2 | $\phi_{t-2}p_2(t_0-t_1)(u_0-u_1)$
3 | $\phi_{t-3}p_3[(t_1-t_2)(u_0-u_1) + (t_0-t_1)(u_1-u_2)]$ | $\phi_{t-3}p_3(t_0-t_1)(u_0-u_1)$
x | $\phi_{t-x}p_x[(t_{x-2}-t_{x-1})(u_0-u_1) + (t_{x-3}-t_{x-2})(u_1-u_2) + \ldots + (t_0-t_1)(u_{x-2}-u_{x-1})]$ | $\phi_{t-x}p_x[(t_{x-3}-t_{x-2})(u_0-u_1) + \ldots + (t_0-t_1)(u_{x-3}-u_{x-2})]$ | ... | $\phi_{t-x}p_x(t_0-t_1)(u_0-u_1)$




















It follows by means of the same argument as before that grade 2 can be kept constant if
we can find the $u_i$ such that the equation

    $x^{n-1}p_1(t_0-t_1) + x^{n-2}p_2[(t_1-t_2) + (t_0-t_1)u_1] + \ldots$
    $\quad + p_n[(t_{n-1}-t_n) + (t_{n-2}-t_{n-1})u_1 + \ldots + (t_0-t_1)u_{n-1}] = 0$

has the same roots which were common to $x^n + x^{n-1}p_1 + \ldots + p_n = 0$ and

    $x^{n-1}p_1(t_0-t_1) + x^{n-2}p_2(t_0-t_2) + \ldots + p_n(t_0-t_n) = 0,$

which is identical with the difference of the first two equations of degree $n$, referring
respectively to the whole population and to the lowest grade. We must further insist that all $u_i$
must have non-negative values, not larger than 1. The coefficients of the powers of $x$ must
also be positive, but it is not necessary that $u_{i+1} \le u_i$, unless we do not admit reversions.
If $m$ is the number of common roots, then it follows again as in the last section that $n-m+1$
grades could exist which remain constant under the operation of promotions, but that their
age and seniority distributions change.

Example. Dealing once more with the same example as in the previous sections, we have 
to find a biquadratic equation 


§(1 — $8) 8 + $8133 — 348) + (1-33) wo? + AH - + G- HD a + (1-9) wy) e* 
+ HS[($8 —$) + (B48 — $8) uit ($8 — HB) wat (1-H) as) z 
+ shal($—$) + ($8 —$) 4. + (HE - FH) e+ H—-HP w+ (1-H) wu] = 0, 













































































































































































or, if we use four significant figures in every fraction,

    $x^4 + (0.5417 + 0.5833u_1)x^3 + (0.1973 + 0.1658u_1 + 0.1786u_2)x^2$
    $\quad + (0.01805 + 0.04274u_1 + 0.03593u_2 + 0.03869u_3)x$
    $\quad + (0 + 0.001389u_1 + 0.003288u_2 + 0.002764u_3 + 0.002976u_4) = 0.$

This biquadratic equation must have the roots $-\tfrac{1}{4} \pm \sqrt{-\tfrac{5}{48}}$ and $-\tfrac{1}{8}$. If the fourth root is
called $(-z)$, then the equation must be identical with

    $(x^3 + \tfrac{5}{8}x^2 + \tfrac{11}{48}x + \tfrac{1}{48})(x + z) = 0.$

Simple arithmetic shows then that

    $u_1 = 0.1429 + 1.7142z$,  $u_2 = 0.0459 + 1.9082z$,
    $u_3 = -0.1283 + 2.2566z$  and  $u_4 = 0.0019 + 1.9962z$.

Now $z$ must be at least 0.05685 to make $u_3$ positive and it must not exceed 0.5, because
otherwise the $u_i$ would exceed unity. But then $u_4$ will always be larger than $u_3$, unless we put
$z = \tfrac{1}{2}$, which would mean $u_i = 1$ for all $i$ and then there would be no members at all in grades 3
and above. It follows that we must admit reversions from grade 3 into grade 2. We can then,
for instance, take $z = 0.2$ and have

    $u_1 = 0.4857$,  $u_2 = 0.4275$,  $u_3 = 0.3230$  and finally  $u_4 = 0.4011$.

The biquadratic equation becomes

    $x^4 + \tfrac{33}{40}x^3 + \tfrac{17}{48}x^2 + \tfrac{1}{15}x + \tfrac{1}{240} = 0.$

This is the same as the one used in § 4, and we can again write down the changing pattern of
the population, but this time taking also seniority into account:
maocsish Th 
Age t=0 1 2 3 de 
wi 
1 | 73)}—|—|—|—j 73) 43 | — | — | — | 43 | 50 | — | — | — | 60] 62 | — | — | — | 82 | 
2 |12| 6 | —|—]|— | 18] 40 | 20 | — | — | 60} 23/ 12 | — | —]| 35! 27] 14] —| — | 41 _ 
3 3 2 i1j—|— 6 4 2;2)]— 8 | 15 6; 56 | — | 26 8 3) 3|;—|M ke 
4); 3;] 3} 3} 2}—j)1l1}—|—)]—J—] OJ—]} LJ}—t—] ty ayt2yir)i14d & ra: 
Sed ee Pe a ae Lod § bad BS ee be oe pe ee oe FE 1 
to 
91} 11] 6 3 1 j112 | 87 | 22] 3 | — [112 | 88/19] 5&5 | — [112 | 88] 19] 4 1 113| on 
4 5 6 7 and later | de 
l tk 
1 }40;—}] —|—] —| |] so | — | — | —] so | co | — | — | —| co] co | — | —| —| gl 
2 28} 15 | — | — | — | 43 | 27} 18 | — | — | 40 | 27] 14 | — | — |] 41 | 27) 14] — | —} 41 be 
3 10; 4] 4 |—{--]| 18] 10 5} 4)}— {19 10 | 4|—]18)] 10 4; 4 j,—/ 18) 
4 1 1j—-|—|i— 2 1 lj ilj— 3 1 tt}; l1j]— 3 1 1jij— 3| di 
az ‘oR de Ged eof tee ‘Sat feaal fa See hg a ed tat ed Se eee ce 
te 
88 | 20) 4 | — | — j112] 88] 19] 5& | — {112 | 88] 19|'5 | —]112] 88} 19] 5&6 | — ua 
k 

















































































































































Grade 3 
Age 0 1 2 3 
2 Fi — 1] —] — 1 — 1 2 28 — | — 1 — | 28) 8) — | — 1 — 1 is fe eas 
3 1a pH | EB OS ST eT ea A ee ee a Be SS 
4 43. S41 — eet 11 —t i t— ft 81 ta tii 22 bee ee 
5 LS KS T Pie TR Pe Peas Py a Se ee ee ee et 
ieee Pe peor Ae ee ee ae oe ke Be Se eae ee ER ei SR! 
4 5 6 7 and later 
Ss.) 1) — [—)] — | —| | BT i) | — | — 1] — | 208 st — | J] | 2 2 1 a 
3 4/5 })—[—]—] 91 41.86 [—1—1-9} €@] 64 —)—'11 O47 6 1—1—)9 
4 Ro ES tH hE — i aE SE ET ee Ba 1 Se oe 6 eee 
6. — | bee PS ED ee ee eee ee ee 
0) 713 )—]—} sot aie) 8 i] 201 i ti Ss 1—1 eet Le tae 











We find, as before, that further splitting up of grades is impossible, if the total in each 
grade is to remain constant throughout the years. 
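
The coefficient matching in the example of this section can be checked numerically. The sketch below is an illustration only: it assumes that the cubic factor carrying the three common roots is $x^3 + \tfrac{5}{8}x^2 + \tfrac{11}{48}x + \tfrac{1}{48}$ (a reading inferred from the four-figure coefficients, not taken directly from the print), and the function names `biquadratic` and `solve_u` are hypothetical.

```python
from fractions import Fraction as F

# Assumed reading of the cubic factor that carries the three common roots:
# x^3 + 5/8 x^2 + 11/48 x + 1/48
CUBIC = [F(1), F(5, 8), F(11, 48), F(1, 48)]


def biquadratic(z):
    """Coefficients of (cubic)(x + z), highest power first."""
    z = F(z)
    out = [F(0)] * 5
    for i, c in enumerate(CUBIC):
        out[i] += c          # term of the cubic multiplied by x
        out[i + 1] += c * z  # term of the cubic multiplied by z
    return out


def solve_u(z):
    """Match the four-figure coefficients of the text and solve for
    u_1, ..., u_4 in turn (each equation introduces one new unknown)."""
    _, c3, c2, c1, c0 = (float(c) for c in biquadratic(z))
    u1 = (c3 - 0.5417) / 0.5833
    u2 = (c2 - 0.1973 - 0.1658 * u1) / 0.1786
    u3 = (c1 - 0.01805 - 0.04274 * u1 - 0.03593 * u2) / 0.03869
    u4 = (c0 - 0.001389 * u1 - 0.003288 * u2 - 0.002764 * u3) / 0.002976
    return u1, u2, u3, u4


if __name__ == "__main__":
    print(biquadratic(F(1, 5)))
    print([round(u, 4) for u in solve_u(F(1, 5))])
```

With $z = \tfrac{1}{5}$ the sketch reproduces the $u_i$ quoted in the example to within rounding of the four-figure coefficients, and with $z = \tfrac{1}{2}$ all $u_i$ come out close to unity.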


SUMMARY 


This investigation deals with a stratified population, which is subject to (i) mortality, 
dependent on age, and to (ii) promotion rates, indicating the ratios of members of a grade 
which are transferred to the next higher grade at the end of the year. 

Section 1 concerns a population which is not yet stratified and formulae are deduced to
calculate the number of entrants at time $t$, necessary to replace yearly deaths and thus to
keep the total of the population constant. This number depends clearly on the mortality
rates and on the age distribution existing at time $t = 0$. In general the population tends
towards a limiting age distribution, the 'intrinsic stationary population'.

Section 2 considers two populations and conditions are derived for the case that they need, 
every year, equal numbers of entrants to keep them constant. 

Section 3 introduces the stratified population. Both mortality and promotion rates
depend on the age, and they are independent of the time $t$. Under certain conditions one of
the two populations considered in § 2 can be taken as the whole and the other as the lowest
grade in it. It is shown how and when entries into the grade can, at the same time, replace
both losses due to mortality in the whole population, and to mortality and promotion
depleting the lowest grade. This can also be described by saying that the totals of both grades
can be kept constant at the same time, although the age distributions change from year
to year.

Section 4 generalizes the results of the previous section for a population consisting of
$k$ grades. If the population is spread over $n$ ages, then it is shown that up to $n-1$ grades
may be possible in the most favourable case, such that they are all kept constant, whilst
the age distributions all oscillate. Such a population is called semi-stationary.

Section 5 introduces the case which has been of actual importance in practical establish- 
ment work: the promotion rates are made dependent on the time spent in the grade instead 
of on the age. 

A numerical example is attached to §1 and is carried through all stages to illustrate the 
results which emerge gradually in the subsequent sections. 


REFERENCES 


Leslie, P. H. (1945). On the use of matrices in certain population mathematics. Biometrika, 33, 183.

Seal, H. L. (1945). The mathematics of a population composed of k stationary strata. Biometrika,
33, 226.










A SIMPLE APPROACH TO CONFOUNDING AND FRACTIONAL 
REPLICATION IN FACTORIAL EXPERIMENTS 


By 0. KEMPTHORNE, Rothamsted Experimental Station 


INTRODUCTION 


The design and analysis of factorial experiments was described in 1937 by Yates in considerable
detail. In his treatment Yates described first the $2^n$ system and then went on to deal
with $3^n$ experiments and experiments of the $2^m 3^n$ type. The $2^n$ system is capable of very
easy explanation, but with experiments of higher order both the design and analysis became
of increasing complexity. It is the purpose of this paper to present a general method by
which factorial designs of the type $p^n$ may be examined, in respect of both confounding and
fractional replication. The method will be described by explanation of the rules for the $2^n$
and $3^n$ systems and corresponds quite closely to that given by Fisher (1942). The present
approach presents confounding and fractional replication as different aspects of the same
process. Experimental designs suggested by Plackett & Burman (1946) are also discussed.


THE $2^n$ SYSTEM


In this system all combinations of $n$ factors each at two levels are tested. The totality of
treatment combinations may be represented by the points of an $n$-dimensional lattice, each
side being of unit length. Let the factors be $x_1, x_2, \ldots, x_n$ and take $n$ mutually orthogonal
axes $y_1, \ldots, y_n$. The point $(000\ldots0)$ will then represent the control treatment, $(1000\ldots0)$ the
treatment consisting of $x_1$ at the upper level and all the other factors at the lower level, and
so on. The treatment effect of $x_1$ is the difference of the means of the yields of plots receiving
$x_1$ and those not receiving $x_1$. It is therefore the difference between the mean of the plots
represented by points lying on the plane $y_1 = 1$ and the mean of those represented by the
points on the plane $y_1 = 0$. The interaction of $x_1$ and $x_2$ is the difference between the means
of those plots represented by $y_1 = 1$, $y_2 = 1$ or $y_1 = 0$, $y_2 = 0$ and those represented by
$y_1 = 0$, $y_2 = 1$ and $y_1 = 1$, $y_2 = 0$, i.e. the difference of the means of those plots for which

    $y_1 + y_2 = 2$ or $= 0 \pmod 2$,

and those for which

    $y_1 + y_2 = 1 \pmod 2$.

Similarly, the triple interaction of $x_1$, $x_2$ and $x_3$ is the difference between the means of those
plots for which

    $y_1 + y_2 + y_3 = 0 \pmod 2$,

and those for which

    $y_1 + y_2 + y_3 = 1 \pmod 2$.

This process can be continued to the consideration of the interaction of $x_1, x_2, \ldots, x_n$ which
is the difference between the mean of those plots for which

    $y_1 + y_2 + y_3 + \ldots + y_n = 0 \pmod 2$,

and the mean of those for which

    $y_1 + y_2 + y_3 + \ldots + y_n = 1 \pmod 2$.


In the $n$-dimensional space parallel hyper-planes may be drawn containing the points of the
lattice, such that the total yield forming the positive part of an interaction is obtained from
a set of parallel hyper-planes equidistant from each other. Likewise the negative part is
obtained from another set of parallel hyper-planes, each plane of which lies midway between
two planes of the first set.
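
The rule that every effect or interaction in the $2^n$ system is a difference of means between plots whose relevant coordinates sum to 0 and to 1 (mod 2) can be sketched in a few lines. This is a minimal illustration: the yields are hypothetical, the function name `contrast` is an assumption, and the sign convention (a plot is counted as positive when the sum has the same parity as the number of factors involved) is one common choice rather than something fixed by the text.

```python
from itertools import product


def contrast(yields, factors):
    """Mean of the 'positive' plots minus mean of the 'negative' plots.

    yields  : dict mapping lattice points (tuples of 0/1) to plot yields
    factors : indices of the factors entering the effect or interaction
    Sign convention (an assumption): a plot is positive when the sum of its
    levels over `factors` has the same parity as len(factors).
    """
    pos, neg = [], []
    for point, y in yields.items():
        s = sum(point[i] for i in factors) % 2
        (pos if s == len(factors) % 2 else neg).append(y)
    return sum(pos) / len(pos) - sum(neg) / len(neg)


# Hypothetical yields for a 2^3 experiment: additive effects of x1 and x2 only.
YIELDS = {pt: 10 + 4 * pt[0] + 2 * pt[1] for pt in product((0, 1), repeat=3)}

if __name__ == "__main__":
    for f in ([0], [1], [2], [0, 1], [0, 1, 2]):
        print(f, contrast(YIELDS, f))
```

Because the hypothetical yields are purely additive, only the two main effects come out non-zero; every interaction contrast vanishes.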


THE $3^n$ SYSTEM


With $n$ factors at each of three levels the treatment combinations are given by an $n$-dimensional
lattice, each side being of length two units and containing three points. The treatment
contrasts may be described as in the $2^n$ system with some slight modifications.

Any contrast in the $3^n$ system involves the comparison of three totals of the yields of $3^{n-1}$
plots, and may be represented by the comparison of the differences between the yields of the
plots lying on three sets of parallel hyper-planes. For example, if $n = 2$ the lattice is as
follows:








                 $y_1$
              0    1    2
         0    .    .    .
  $y_2$  1    .    .    .
         2    .    .    .














The main effect of $x_1$ is the difference between the totals of yields of plots for which $y_1 = 0$,
$y_1 = 1$ and $y_1 = 2$. The $I$ component* of the interaction of $x_1$ and $x_2$ is the difference between
the totals of the yields of plots for which

    $y_1 - y_2 = 0$,  $y_1 - y_2 = 1$,  and  $y_1 - y_2 = 2 \pmod 3$.

The $J$ component is given by the contrast between the yields of plots for which

    $y_1 + y_2 = 0$,  $y_1 + y_2 = 1$,  and  $y_1 + y_2 = 2 \pmod 3$.

Anticipating the extension to cases when $n$ is greater than 2, the equations for the $I$
component may be written as follows:

    $X_1X_2(I_0)$:  $y_1 + 2y_2 = 0 \pmod 3$,
    $X_1X_2(I_1)$:  $y_1 + 2y_2 = 1 \pmod 3$,
    $X_1X_2(I_2)$:  $y_1 + 2y_2 = 2 \pmod 3$.

If $x_1$ and $x_2$ (and therefore $y_1$ and $y_2$) are interchanged, then $X_2X_1(I_0)$ is given by the
equation $y_2 + 2y_1 = 0$, $X_2X_1(I_1)$ by $y_2 + 2y_1 = 1$, $X_2X_1(I_2)$ by $y_2 + 2y_1 = 2$, all mod 3. But
the equation $y_2 + 2y_1 = 0 \pmod 3$ is identical with the equation $y_1 + 2y_2 = 0 \pmod 3$, since
$3y_1 + 3y_2 = 0 \pmod 3$, whatever the values of $y_1$ and $y_2$; $X_2X_1(I_0)$ is therefore equal to
$X_1X_2(I_0)$. Subtracting the equation $y_2 + 2y_1 = 1 \pmod 3$ from the equation

    $3y_1 + 3y_2 = 0 = 3 \pmod 3$,

we get $y_1 + 2y_2 = 2 \pmod 3$; $X_2X_1(I_1)$ is therefore identical with $X_1X_2(I_2)$. It is obvious
from the equations given above for the $J$ component that $X_1X_2(J_i) = X_2X_1(J_i)$ for $i = 0$,
1 and 2.


* Yates's terminology for the components of interactions is used where convenient, but it is more
convenient to refer to $I_1$, $I_2$ and $I_3$ as $I_0$, $I_1$ and $I_2$ respectively.
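
The effect of interchanging the two factors on the $I$ and $J$ components can be verified by listing the lattice points that satisfy each congruence. The sketch below is illustrative; the helper name `classes` is hypothetical.

```python
from itertools import product

POINTS = list(product(range(3), repeat=2))  # lattice points (y1, y2)


def classes(a1, a2):
    """Split the nine points by the value of a1*y1 + a2*y2 (mod 3).

    Returns a list of three frozensets; the index is the residue.
    """
    groups = [set(), set(), set()]
    for y1, y2 in POINTS:
        groups[(a1 * y1 + a2 * y2) % 3].add((y1, y2))
    return [frozenset(g) for g in groups]


I12 = classes(1, 2)  # X1X2(I_i): y1 + 2*y2 = i
I21 = classes(2, 1)  # X2X1(I_i): y2 + 2*y1 = i
J12 = classes(1, 1)  # J component: y1 + y2 = i (symmetric in y1, y2)

if __name__ == "__main__":
    for i in range(3):
        print(i, sorted(I12[i]), sorted(I21[i]))
```

The residue classes 1 and 2 of the $I$ component change places when the factors are interchanged, while class 0 is unchanged, so the contrast as a whole is the same set of two degrees of freedom.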













Considering the case $n = 3$, it is easily seen that the second order interaction may be
split into four parts each consisting of the contrasts between three totals. These may be
represented by the following equations:

    (I)    $y_1 +  y_2 +  y_3 = 0, 1, 2 \pmod 3$,
    (II)   $y_1 + 2y_2 +  y_3 = 0, 1, 2 \pmod 3$,
    (III)  $y_1 +  y_2 + 2y_3 = 0, 1, 2 \pmod 3$,
    (IV)   $y_1 + 2y_2 + 2y_3 = 0, 1, 2 \pmod 3$.

In order these have been named by Yates

    $Z$, $X$, $Y$, $W$.

It is interesting in passing to note the relations between $Z$, $X$, $Y$ and $W$ for permutations of
the order of the factors. It is obvious from (I) that $Z$ is invariant for any change in order of
the three factors $X_1$, $X_2$ and $X_3$. Interchanging $y_2$ and $y_3$, equations (II) become equations
(III), so that $ABC(X) = ACB(Y)$. The following interchanges may be easily verified (using
the equation $3y_1 + 3y_2 + 3y_3 = 0 \pmod 3$ where necessary):

    $ABC(X) = BCA(W) = CAB(Y) = ACB(Y) = CBA(X) = BAC(W)$.

From the equations, it is clear that $Z$, $X$, $Y$ and $W$ may be computed in the way given by

    $Z = J\{x_1, J(x_2, x_3)\}$,   $Y = J\{x_1, I(x_2, x_3)\}$,
    $X = I\{x_1, I(x_2, x_3)\}$,   $W = I\{x_1, J(x_2, x_3)\}$,

$I(x_2, x_3)$ and $J(x_2, x_3)$ being evaluated for each level of $x_1$. The extension to the case $n = 4$ is
again obvious; the main effects, two-factor and three-factor interactions, follow as in the
above, and the four-factor interaction may be split into eight comparisons of three totals:

    I     $y_1 +  y_2 +  y_3 +  y_4 = 0, 1, 2 \pmod 3$,
    II    $y_1 +  y_2 +  y_3 + 2y_4 = 0, 1, 2 \pmod 3$,
    III   $y_1 +  y_2 + 2y_3 +  y_4 = 0, 1, 2 \pmod 3$,
    IV    $y_1 +  y_2 + 2y_3 + 2y_4 = 0, 1, 2 \pmod 3$,
    V     $y_1 + 2y_2 +  y_3 +  y_4 = 0, 1, 2 \pmod 3$,
    VI    $y_1 + 2y_2 +  y_3 + 2y_4 = 0, 1, 2 \pmod 3$,
    VII   $y_1 + 2y_2 + 2y_3 +  y_4 = 0, 1, 2 \pmod 3$,
    VIII  $y_1 + 2y_2 + 2y_3 + 2y_4 = 0, 1, 2 \pmod 3$.








As in the case of two factors, the effect of permutations of the order on the components
$X$, $Y$, $Z$ and $W$ may be easily obtained. The four-factor interactions may be computed by
putting the equations given above into the following form:

    I   = $J\{x_1, Z\}$,    V    = $I\{x_1, W\}$,
    II  = $J\{x_1, Y\}$,    VI   = $I\{x_1, X\}$,
    III = $J\{x_1, X\}$,    VII  = $I\{x_1, Y\}$,
    IV  = $J\{x_1, W\}$,    VIII = $I\{x_1, Z\}$,

where the three components of $W$, $X$, $Y$, $Z$ of $x_2$, $x_3$, $x_4$ (in that order) are evaluated for
each level of $x_1$.
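
How a component of the three-factor interaction is renamed when the factors are written in a different order can be worked out mechanically: permute the coefficients and rescale (mod 3) so that the leading coefficient is 1. The sketch below assumes the naming $Z, X, Y, W$ for the coefficient patterns $(1,1,1)$, $(1,2,1)$, $(1,1,2)$, $(1,2,2)$; the function names are hypothetical.

```python
from itertools import permutations

# Coefficient patterns on (first, second, third) factor and their names.
NAMES = {(1, 1, 1): "Z", (1, 2, 1): "X", (1, 1, 2): "Y", (1, 2, 2): "W"}


def normalise(coeffs):
    """Rescale mod 3 so that the first non-zero coefficient equals 1."""
    lead = next(c for c in coeffs if c % 3)
    inv = 1 if lead % 3 == 1 else 2  # 2 * 2 = 4 = 1 (mod 3)
    return tuple((inv * c) % 3 for c in coeffs)


def renamed(component, order):
    """Name taken by ABC(component) when the factors are listed in `order`."""
    base = next(k for k, v in NAMES.items() if v == component)
    by_factor = dict(zip("ABC", base))  # coefficient attached to each factor
    return NAMES[normalise(tuple(by_factor[f] for f in order))]


if __name__ == "__main__":
    for perm in permutations("ABC"):
        order = "".join(perm)
        print(order, renamed("X", order))
```

Under this convention $ABC(X)$ appears as $Y$ for the orders ACB and CAB, as $W$ for BAC and BCA, and as $X$ for CBA, while $Z$ keeps its name for every order.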


THE $p^n$ SYSTEM


The total of $p^n - 1$ degrees of freedom, where $p$ is a prime, in the analysis of variance of
a $p^n$ experiment may be split into $(p^n-1)/(p-1)$ sets of $(p-1)$ degrees of freedom, the
contrasts being given by the following hyper-planes:

Main effects:

    $y_1 = 0, 1, 2, \ldots, p-1$,
    ...
    $y_n = 0, 1, 2, \ldots, p-1$.

Interactions of pairs:

    $y_1 + y_2 = 0, 1, 2, \ldots, p-1 \pmod p$,
    $y_1 + 2y_2 = 0, 1, 2, \ldots, p-1 \pmod p$,
    ...
    $y_1 + (p-1)y_2 = 0, 1, 2, \ldots, p-1 \pmod p$,

and so on to the interaction between all the factors which is given by the hyper-planes

    $a_1y_1 + a_2y_2 + a_3y_3 + \ldots + a_ny_n = 0, 1, 2, \ldots, p-1 \pmod p$,

where $a_1$ equals 1 and $a_2, a_3, \ldots, a_n$ each may take all values from 1 to $p-1$.
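
The count of $(p^n-1)/(p-1)$ sets can be confirmed by enumerating coefficient vectors directly. The sketch below is illustrative and the function name `contrast_sets` is hypothetical; it keeps one representative of each family of parallel hyper-planes by requiring the first non-zero coefficient to be 1.

```python
from itertools import product


def contrast_sets(p, n):
    """Coefficient vectors (a_1, ..., a_n) mod p, not all zero, whose first
    non-zero coefficient is 1; each stands for one set of p - 1 degrees
    of freedom."""
    out = []
    for a in product(range(p), repeat=n):
        nonzero = [c for c in a if c]
        if nonzero and nonzero[0] == 1:
            out.append(a)
    return out


if __name__ == "__main__":
    for p, n in [(2, 4), (3, 3), (5, 2)]:
        print(p, n, len(contrast_sets(p, n)), (p**n - 1) // (p - 1))
```

Vectors with every coefficient non-zero correspond to the interaction between all $n$ factors; for $p = 3$, $n = 4$ there are eight of them, matching the eight comparisons listed for the four-factor interaction.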


SIMPLIFICATION OF NOTATION 


The $p^n - 1$ degrees of freedom in the $p^n$ system may be split into $(p^n-1)/(p-1)$ sets of
$(p-1)$ degrees of freedom, given by the above hyper-planes, but it is only necessary to
specify one hyper-plane of each set of the parallel hyper-planes.

All the comparisons may be denoted by $y_1^{\alpha_1} y_2^{\alpha_2} \ldots y_n^{\alpha_n}$, the symbol meaning that the
comparisons are given by the hyper-planes

    $\alpha_1 y_1 + \alpha_2 y_2 + \ldots + \alpha_n y_n = 0, 1, 2, \ldots, p-1 \pmod p$.

In order to obtain an enumeration which covers all the possibilities once and once only, it is
necessary to use the rule that the factors are always written down in ascending order,
i.e. $y_i^{\alpha_i} y_j^{\alpha_j} y_k^{\alpha_k}$, etc., such that $i<j<k\ldots$ and that $\alpha_i = 1$.










THE $3^n$ SYSTEM IN THE REVISED NOTATION
As an example, the $3^3$ system will be examined in detail. The effects are represented by
$y_1$, $y_2$, $y_3$; interactions between pairs, $y_1y_2$, $y_1y_2^2$, $y_1y_3$, $y_1y_3^2$, $y_2y_3$, $y_2y_3^2$; interactions between
all three factors, $y_1y_2y_3$, $y_1y_2y_3^2$, $y_1y_2^2y_3$, $y_1y_2^2y_3^2$. Any other combination of powers of the $y$'s
can be reduced to the above set.

It is interesting to examine the interactions of the effects and interactions. In the case of
the $2^n$ system, Yates refers to the generalized interaction of two interactions $ABCD$ and
$CDE$ say, which is $ABE$. The interaction of effects or interactions $A$ and $B$ consists of $AB$
and $AB^2$ in the $3^n$ system.

(a) The interactions of main effects are obviously interactions between pairs of factors.

(b) The interactions of main effects and two-factor interactions with one letter in common
are two-factor interactions and main effects: e.g. the interactions of $y_1$ and $y_1y_2$ are

    $y_1^2y_2 = y_1y_2^2$, and $y_1^3y_2^2 = y_2$,

and the interactions of $y_1$ and $y_1y_2^2$ are $y_1^2y_2^2 = y_1y_2$, and $y_1^3y_2^4 = y_2$.


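(c) The interactions between main effects and three-factor interactions are two-factor and
three-factor interactions:

    Between                       Interactions
    $y_1$ and $y_1y_2y_3$         $y_1^2y_2y_3 = y_1y_2^2y_3^2$,      $y_1^3y_2^2y_3^2 = y_2y_3$
    $y_1$ and $y_1y_2y_3^2$       $y_1^2y_2y_3^2 = y_1y_2^2y_3$,      $y_1^3y_2^2y_3^4 = y_2y_3^2$
    $y_1$ and $y_1y_2^2y_3$       $y_1^2y_2^2y_3 = y_1y_2y_3^2$,      $y_1^3y_2^4y_3^2 = y_2y_3^2$
    $y_1$ and $y_1y_2^2y_3^2$     $y_1^2y_2^2y_3^2 = y_1y_2y_3$,      $y_1^3y_2^4y_3^4 = y_2y_3$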








(d) The interactions between two-factor interactions are exemplified in the following table:

    Between                       Interactions
    $y_1y_2$ and $y_1y_2^2$       $y_1^2y_2^3 = y_1$,      $y_1^3y_2^5 = y_2$
    $y_1y_2$ and $y_2y_3$         $y_1y_2^2y_3$,           $y_1y_2^3y_3^2 = y_1y_3^2$
    $y_1y_2$ and $y_2y_3^2$       $y_1y_2^2y_3^2$,         $y_1y_2^3y_3^4 = y_1y_3$








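(e) The interactions between two-factor and three-factor interactions are exemplified in
the following table:

    Between             | $y_1y_2$                  | $y_1y_2^2$
    and
    $y_1y_2y_3$         | $y_1y_2y_3^2$, $y_3$      | $y_1y_3^2$, $y_2y_3^2$
    $y_1y_2y_3^2$       | $y_1y_2y_3$, $y_3$        | $y_1y_3$, $y_2y_3$
    $y_1y_2^2y_3$       | $y_1y_3^2$, $y_2y_3$      | $y_1y_2^2y_3^2$, $y_3$
    $y_1y_2^2y_3^2$     | $y_1y_3$, $y_2y_3^2$      | $y_1y_2^2y_3$, $y_3$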

















The interactions between two-factor and three-factor interactions are therefore two-factor
interactions in some cases and main effects and three-factor interactions in the other cases.
(f) The interactions between three-factor interactions are set out in the following diagram:

    Between             | $y_1y_2y_3$ | $y_1y_2y_3^2$     | $y_1y_2^2y_3$       | $y_1y_2^2y_3^2$
    and
    $y_1y_2y_3$         | -           | $y_1y_2$, $y_3$   | $y_1y_3$, $y_2$     | $y_1$, $y_2y_3$
    $y_1y_2y_3^2$       | -           | -                 | $y_1$, $y_2y_3^2$   | $y_1y_3^2$, $y_2$
    $y_1y_2^2y_3$       | -           | -                 | -                   | $y_1y_2^2$, $y_3$
    $y_1y_2^2y_3^2$     | -           | -                 | -                   | -
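
The generalized interactions tabulated in (b) to (f) follow from one rule: add the exponent vectors of $A$ and $B^k$, reduce mod 3, and rescale so that the first non-zero exponent is 1. A minimal sketch is given below; the function names are hypothetical, and `pow(x, -1, p)` for the modular inverse needs Python 3.8 or later.

```python
def normalise(v, p=3):
    """Reduce exponents mod p and rescale so the first non-zero one is 1."""
    lead = next((c for c in v if c % p), None)
    if lead is None:
        return tuple(0 for _ in v)
    inv = pow(lead, -1, p)  # modular inverse (Python 3.8+)
    return tuple((inv * c) % p for c in v)


def interactions(a, b, p=3):
    """Normalised exponent vectors of A*B^k for k = 1, ..., p-1."""
    return [normalise(tuple(x + k * y for x, y in zip(a, b)), p)
            for k in range(1, p)]


def show(v):
    """Write an exponent vector in the y1 y2^2 ... notation."""
    return "".join(f"y{i + 1}" + (f"^{e}" if e > 1 else "")
                   for i, e in enumerate(v) if e) or "I"


if __name__ == "__main__":
    three_factor = [(1, 1, 1), (1, 1, 2), (1, 2, 1), (1, 2, 2)]
    for i, a in enumerate(three_factor):
        for b in three_factor[i + 1:]:
            print(show(a), "x", show(b), "->",
                  ", ".join(show(v) for v in interactions(a, b)))
```

Run on the four three-factor interactions, the sketch lists, for each pair, one main effect and one two-factor interaction.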
CONFOUNDING


Confounding or the allocation of treatment combinations to blocks implies the allocation of
all the points of the lattice into $p^s$ sets, of $p^{n-s}$ points, such that the comparisons between
these sets involve particular sets of $p-1$ degrees of freedom. The aim of confounding is to
reduce the effect of soil heterogeneity by reducing block size, but ensuring that the block
comparisons have little possible practical importance.

If comparisons $A = y_1^{\alpha_1}y_2^{\alpha_2}\ldots y_n^{\alpha_n}$ and $B = y_1^{\beta_1}y_2^{\beta_2}\ldots y_n^{\beta_n}$ are confounded, then so is their
generalized interaction, i.e. all the products of these two, i.e. $AB$, $AB^2$, ..., $AB^{p-1}$. For, if
the treatment combinations for which $\alpha_1y_1+\alpha_2y_2+\ldots+\alpha_ny_n$ is equal to $0, 1, 2, \ldots, p-1$
are put into separate blocks and also those treatments for which $\beta_1y_1+\beta_2y_2+\ldots+\beta_ny_n$ is
equal to $0, 1, 2, \ldots, p-1$, then $(\alpha_1+\lambda\beta_1)y_1 + (\alpha_2+\lambda\beta_2)y_2 + \ldots + (\alpha_n+\lambda\beta_n)y_n$ is equal (mod $p$)
to $0, 1, 2, \ldots, p-1$ for all $\lambda$ from 0 to $p-1$.

The present approach to confounding of the $2^n$ system is identical with that given by
Yates and we proceed to consider the rather more complex case of the $3^n$ system.


(a) $3^3$ system

(1) In blocks of $3^2$. Any three-factor interaction may be confounded.

(2) In blocks of 3. We cannot confine the confounded degrees of freedom to three-factor
interactions because the generalized interaction of any two reduces to a two-factor interaction
and a main effect. If two three-factor interactions, $y_1y_2y_3$ and $y_1y_2^2y_3$, are confounded,
the 8 degrees of freedom for blocks may be described as follows:

                        D.F.
    $y_2$                 2
    $y_1y_3$              2
    $y_1y_2y_3$           2
    $y_1y_2^2y_3$         2
                          8

We can, however, choose three two-factor interactions and one three-factor interaction pair
for our block comparisons.
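
The allocation of treatment combinations to blocks can be sketched by grouping the lattice points on the values of the confounded linear forms. In the illustration below the function name `blocks` is hypothetical, and the pair of confounded interactions, $y_1y_2y_3$ and $y_1y_2^2y_3$, is chosen only as an example.

```python
from collections import defaultdict
from itertools import product


def blocks(n, generators, p=3):
    """Group the p^n treatment combinations by the values (mod p) of each
    confounded linear form; every distinct key is one block."""
    out = defaultdict(list)
    for y in product(range(p), repeat=n):
        key = tuple(sum(a * v for a, v in zip(g, y)) % p for g in generators)
        out[key].append(y)
    return dict(out)


if __name__ == "__main__":
    nine = blocks(3, [(1, 1, 1)])                # 3 blocks of 9 plots
    three = blocks(3, [(1, 1, 1), (1, 2, 1)])    # 9 blocks of 3 plots
    print(len(nine), len(three))
    for key, plots in sorted(three.items()):
        print(key, plots)
```

In the blocks of three, the level of the second factor and the value of $y_1 + y_3$ (mod 3) are both constant within each block, which shows that a main effect and a two-factor interaction are confounded along with the two chosen three-factor interactions.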










(b) $3^4$ system in blocks of $3^2$

It is immediately obvious that we can confound two two-factor interactions and two
higher-order interaction pairs to give blocks of nine. The important point, however, is to find
a design confounding only three-factor interaction pairs.

We therefore evaluate the interactions of all pairs of three-factor interactions, which have
two letters in common. These may be derived from the interaction of $y_1y_2y_3$ with the four
three-factor interactions of $y_1$, $y_2$ and $y_4$, which are as follows:

    Interaction of $y_1y_2y_3$ and $y_1y_2y_4$:         $y_1y_2y_3^2y_4^2$ and $y_3y_4^2$,
    Interaction of $y_1y_2y_3$ and $y_1y_2y_4^2$:       $y_1y_2y_3^2y_4$ and $y_3y_4$,
    Interaction of $y_1y_2y_3$ and $y_1y_2^2y_4$:       $y_1y_3^2y_4^2$ and $y_2y_3^2y_4$,
    Interaction of $y_1y_2y_3$ and $y_1y_2^2y_4^2$:     $y_1y_3^2y_4$ and $y_2y_3^2y_4^2$.

Obviously there are many designs for the $3^4$ design in nine blocks of nine plots confounding
three-factor interactions. Those which confound four-factor interactions must also confound
two-factor interactions. The names of the confounded interactions and their squares (each
of which corresponds to the same grouping as the element itself) form a group with the
identity and the equation $y_i^3 = 1$, for all $i$, and further work is presumably most promising
on these lines.


(c) $3^5$ in blocks of 9
There is no design confounding only three-factor or higher-order interactions. If one
two-factor interaction can be sacrificed, a possible scheme of confounding is given by the
following table of generalized interactions:





Between 
Yr Y2¥s YiVayt YiYsYe Ya¥3Vi 
and 
stil Yi Yo¥3 Yas Yi¥2IsYs WYiYs YaYs 
ryaet Yi V2Yi¥s YViViVsYads YiYs¥s YaUs Vas 

















This two-factor interaction is estimated by the comparison of three sets of nine blocks, 
and the accuracy of the estimate will be low. 


(d) $3^6$ in blocks of 27
We may, for example, confound the following:


YaYae 
YY2Ys YiVaYsYVaYs YWrYs¥iye 
WYaYs WY2ViYsYe YWViysye 
YYVYYYS WYsYYe WwYaysyyeye 
Yas Vis YaVsYsVe Ys¥aVs Ye 


Three three-factor interactions, six four-factor interactions, three five-factor interactions
and one six-factor interaction are confounded. If $y_6$ is omitted from all the above expressions
we obtain a $3^5$ experiment in blocks of nine confounding one two-factor interaction, seven
three-factor interactions, three four-factor and two five-factor interactions; that is, the
design given above for the $3^5$ system.


EXTENSION TO MORE COMPLICATED CASES 


Extensions of the above to more complicated cases should most easily be achieved by the use
of group theory. The confounding of a $p^n$ design in $p^s$ blocks corresponds to a group of
$\tfrac{1}{2}(p^s + 1)$ elements such that all except the unit element involve at least a certain number of
letters. For most agricultural experiments each element should contain at least three letters,
so that no main effects or two-factor interactions are confounded. The group is an Abelian
group and if $A$ and $B$ are elements of the group so are $AB$, $AB^2$, ..., $AB^{p-1}$. The order of each
element is $p$, and if $A$ is an element so are the first $(p-1)$ powers of $A$. This aspect is being
followed, and it is hoped will yield results.


FRACTIONAL REPLICATION IN THE 2” SYSTEM 


Some principles of fractional replication have been worked out over the past few years at
Rothamsted (Finney, 1945). In the case of a $2^n$ system, with factors $a_1, \ldots, a_n$ say, a
half-replicate might consist of those treatment combinations which form the positive part of the
interaction $A_1A_2\ldots A_n$. Each function of the plot yields consisting of the sum of one-half
of them minus the sum of the other half then corresponds to two degrees of freedom.
Alternatively, each degree of freedom has one alias, and the aim in fractional replication is
to design the experiment so that the aliases of effects which the experimenter wishes to
measure are high-order interactions which could not possibly have practical significance.

For convenience of presentation, we develop first the theory for the case of the $2^n$ system.
Suppose that of all the points on the lattice for the $2^n$ system, only those points for which

    $y_1 + y_2 + y_3 + \ldots + y_n = 0 \pmod 2$

are included in the experiment. Then the points on the hyper-plane $y_1 = 0$ also lie on the
plane $y_2 + y_3 + \ldots + y_n = 0$, and likewise those for which $y_1 = 1$ lie on the plane

    $y_2 + y_3 + \ldots + y_n = 1$.

The contrast which we have denoted by $y_1$ is therefore identical with that denoted by
$y_2y_3y_4\ldots y_n$. Again, if we suppose that only those treatment combinations are tested which
lie on the hyper-planes

    $\alpha_1y_1 + \alpha_2y_2 + \ldots + \alpha_ny_n = 0$,   $\beta_1y_1 + \beta_2y_2 + \ldots + \beta_ny_n = 0$,

then the points will also lie on the intersection of these planes which is given by the equation

    $(\alpha_1+\beta_1)y_1 + (\alpha_2+\beta_2)y_2 + \ldots + (\alpha_n+\beta_n)y_n = 0 \pmod 2$.

The points which lie on the hyper-planes

    $\gamma_1y_1 + \gamma_2y_2 + \ldots + \gamma_ny_n = 0, 1 \pmod 2$

will also lie on the planes

    $(\alpha_1+\gamma_1)y_1 + (\alpha_2+\gamma_2)y_2 + \ldots + (\alpha_n+\gamma_n)y_n = 0, 1 \pmod 2$,
    $(\beta_1+\gamma_1)y_1 + (\beta_2+\gamma_2)y_2 + \ldots + (\beta_n+\gamma_n)y_n = 0, 1 \pmod 2$,
    $(\alpha_1+\beta_1+\gamma_1)y_1 + (\alpha_2+\beta_2+\gamma_2)y_2 + \ldots + (\alpha_n+\beta_n+\gamma_n)y_n = 0, 1 \pmod 2$.
Changing to the simpler notation, these results may be obtained by equating to unity the
symbols corresponding to the effects which the experiment cannot measure (as only treatment
combinations of the same sign in the function giving the effect are included) and
multiplying the symbol corresponding to a particular effect by these symbols. Thus we put

    $I = y_1^{\alpha_1}y_2^{\alpha_2}\ldots y_n^{\alpha_n} = y_1^{\beta_1}y_2^{\beta_2}\ldots y_n^{\beta_n} = y_1^{\alpha_1+\beta_1}y_2^{\alpha_2+\beta_2}\ldots y_n^{\alpha_n+\beta_n}$,

then the contrast $y_1^{\gamma_1}y_2^{\gamma_2}\ldots y_n^{\gamma_n}$ is the same as those given by

    $y_1^{\alpha_1+\gamma_1}y_2^{\alpha_2+\gamma_2}\ldots y_n^{\alpha_n+\gamma_n}$,   $y_1^{\beta_1+\gamma_1}y_2^{\beta_2+\gamma_2}\ldots y_n^{\beta_n+\gamma_n}$   and   $y_1^{\alpha_1+\beta_1+\gamma_1}y_2^{\alpha_2+\beta_2+\gamma_2}\ldots y_n^{\alpha_n+\beta_n+\gamma_n}$,

where each power is reduced modulo 2.
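
Since every power is reduced modulo 2, multiplying two symbols amounts to taking the symmetric difference of their sets of letters. The sketch below uses that representation to list the aliases of an effect under a given identity relationship; the function names `word`, `defining_group` and `aliases` are hypothetical.

```python
def word(*factors):
    """A symbol such as y1 y2 y3, stored as the set of its subscripts."""
    return frozenset(factors)


def defining_group(generators):
    """All products of the generators except the identity. Multiplication
    is symmetric difference because y_i^2 = 1 in the 2^n system."""
    group = {frozenset()}
    for g in generators:
        group |= {w ^ g for w in group}
    return group - {frozenset()}


def aliases(effect, generators):
    """Symbols that cannot be separated from `effect` in the fraction."""
    return {effect ^ w for w in defining_group(generators)}


if __name__ == "__main__":
    # Half-replicate of 2^3 with I = y1 y2 y3.
    print(aliases(word(1), [word(1, 2, 3)]))
    # Quarter-replicate of 2^6 with I = y1y2y3y4 = y3y4y5y6 = y1y2y5y6.
    print(aliases(word(1, 2), [word(1, 2, 3, 4), word(3, 4, 5, 6)]))
```

Only independent generators need to be supplied; their product is added to the identity relationship automatically.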


$2^n$ SYSTEM WITHOUT SUBDIVISION INTO BLOCKS
We now consider some of the possibilities of partial replication for the $2^n$ system. The basis
of designs with fractional replication is the choice of an identity relationship; most of the
possible relationships are of no value, and we consider only those which yield the least
possible confusion between main effects and first-order interactions.


Half-replication
$n = 3$. If we take $I = y_1y_2y_3$, then $y_1 = y_1(y_1y_2y_3) = y_1^2y_2y_3 = y_2y_3$. Such a design, which
confuses main effects and two-factor interactions, would not be of any practical use.
$n = 4$. If we take $I = y_1y_2y_3y_4$, then the aliases are exemplified by

    $y_1 = y_2y_3y_4$  and  $y_1y_2 = y_3y_4$.

Such a design would not be used unless the experimenter were confident that two-factor
interactions were negligible.
$n = 5$. If we take $I = y_1y_2y_3y_4y_5$, then the aliases are exemplified by

    $y_1 = y_2y_3y_4y_5$  and  $y_1y_2 = y_3y_4y_5$.

A half-replicate with five or more factors is feasible when there is no necessity to remove
heterogeneity by the use of blocks, since main effects will have aliases which are interactions
of four factors at least, and two-factor interactions will have aliases which are interactions of
at least three factors.

Quarter-replication

Each degree of freedom will now have three aliases. For each value of n we give the identity
relationship and typical alias relationships.

n = 4. $I = y_1 y_2 = y_3 y_4 = y_1 y_2 y_3 y_4$;
then $y_1 = y_2 = y_1 y_3 y_4 = y_2 y_3 y_4$ and $y_1 y_3 = y_2 y_3 = y_1 y_4 = y_2 y_4$.

n = 5. $I = y_1 y_2 = y_3 y_4 y_5 = y_1 y_2 y_3 y_4 y_5$ (a);
or $I = y_1 y_2 y_3 = y_3 y_4 y_5 = y_1 y_2 y_4 y_5$ (b).
(a) gives $y_1 = y_2 = y_1 y_3 y_4 y_5 = y_2 y_3 y_4 y_5$ and $y_1 y_3 = y_2 y_3 = y_1 y_4 y_5 = y_2 y_4 y_5$.
(b) gives $y_1 = y_2 y_3 = y_1 y_3 y_4 y_5 = y_2 y_4 y_5$.

n = 6. $I = y_1 y_2 y_3 y_4 = y_3 y_4 y_5 y_6 = y_1 y_2 y_5 y_6$;
then $y_1 = y_2 y_3 y_4 = y_1 y_3 y_4 y_5 y_6 = y_2 y_5 y_6$
and $y_1 y_2 = y_3 y_4 = y_1 y_2 y_3 y_4 y_5 y_6 = y_5 y_6$.

n = 7. $I = y_1 y_2 y_3 y_4 = y_4 y_5 y_6 y_7 = y_1 y_2 y_3 y_5 y_6 y_7$;
then $y_1 = y_2 y_3 y_4$ and $y_1 y_2 = y_3 y_4$.

n = 8. $I = y_1 y_2 y_3 y_4 y_5 = y_4 y_5 y_6 y_7 y_8 = y_1 y_2 y_3 y_6 y_7 y_8$;
then $y_1 = y_2 y_3 y_4 y_5$ and $y_1 y_2 = y_3 y_4 y_5$.

Designs in quarter replicate are therefore possible when n is greater than or equal to 8.
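
The conclusion can be checked mechanically. The sketch below is an illustration in Python, not part of the original paper; factors are written as integers and the generators are my reading of the identity relationships listed above (for n = 5 the form (b) is used). Main effects and two-factor interactions have no alias of fewer than three letters exactly when every word of the identity relationship has at least five letters.

```python
from itertools import combinations


def identity_group(generators):
    """Every non-unit product of the generators, each as a frozenset of factor numbers."""
    group = set()
    for r in range(1, len(generators) + 1):
        for subset in combinations(generators, r):
            word = frozenset()
            for g in subset:
                word ^= frozenset(g)  # letters appearing twice cancel
            group.add(word)
    return group


# Two generators for each n; the third word of each identity is their product.
quarter = {
    4: [(1, 2), (3, 4)],
    5: [(1, 2, 3), (3, 4, 5)],
    6: [(1, 2, 3, 4), (3, 4, 5, 6)],
    7: [(1, 2, 3, 4), (4, 5, 6, 7)],
    8: [(1, 2, 3, 4, 5), (4, 5, 6, 7, 8)],
}
for n, gens in quarter.items():
    shortest = min(len(w) for w in identity_group(gens))
    print(n, shortest, "usable" if shortest >= 5 else "not usable")
```

Only n = 8 reaches a shortest word of five letters among the relationships listed.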

HIGH-ORDER FRACTIONAL REPLICATION 
In general, the existence of fractional designs of the $2^n$ system with fraction $2^{-p}$, which will
be useful where information on all main effects and two-factor interactions is required,
depends on the existence of a group of $2^p$ elements, one element being unity and the other
elements all containing at least five letters. No simple method has been found of enumerating
such groups, but it is perhaps worth recording the following designs which appear to represent
the greatest degree of fractional replication possible.


(a) Eighth replication
If we are testing ten or more factors at each of two levels, one-eighth of a replication will
enable main effects and two-factor interactions to be estimated. An appropriate identity
relationship is the following:

$$I = y_1 y_2 y_3 y_4 y_5 = y_1 y_2 y_6 y_7 y_8 = y_3 y_4 y_5 y_6 y_7 y_8$$
$$= y_1 y_3 y_7 y_9 y_{10} = y_2 y_4 y_5 y_7 y_9 y_{10} = y_2 y_3 y_6 y_8 y_9 y_{10} = y_1 y_4 y_5 y_6 y_8 y_9 y_{10}.$$

Thus ten main effects and forty-five two-factor interactions may be estimated from a trial
testing 128 of the 1024 possible treatment combinations.


(b) Sixteenth replication
If we are testing twelve or more factors a possible identity relationship is the following:

$$I = y_1 y_2 y_3 y_4 y_5 = y_1 y_2 y_6 y_7 y_8 = y_3 y_4 y_5 y_6 y_7 y_8 = y_1 y_2 y_9 y_{10} y_{11}$$
$$= y_3 y_4 y_5 y_9 y_{10} y_{11} = y_6 y_7 y_8 y_9 y_{10} y_{11} = y_1 y_2 y_3 y_4 y_5 y_6 y_7 y_8 y_9 y_{10} y_{11}$$
$$= y_1 y_3 y_6 y_9 y_{12} = y_2 y_4 y_5 y_6 y_9 y_{12} = y_2 y_3 y_7 y_8 y_9 y_{12} = y_1 y_4 y_5 y_7 y_8 y_9 y_{12}$$
$$= y_2 y_3 y_6 y_{10} y_{11} y_{12} = y_1 y_4 y_5 y_6 y_{10} y_{11} y_{12} = y_1 y_3 y_7 y_8 y_{10} y_{11} y_{12} = y_2 y_4 y_5 y_7 y_8 y_{10} y_{11} y_{12}.$$

In this case twelve main effects and sixty-six two-factor interactions may be estimated from
a trial testing 256 of the possible 4096 treatment combinations.
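
Both identity relationships can be checked by generating the full group from a set of generators. The following Python sketch is illustrative only; the generator words are my reading of the printed relationships (three five-letter words for the eighth replicate, four for the sixteenth), and the helper name is hypothetical.

```python
from itertools import combinations


def identity_group(generators):
    """Every non-unit product of the generators, each as a frozenset of factor numbers."""
    group = set()
    for r in range(1, len(generators) + 1):
        for subset in combinations(generators, r):
            word = frozenset()
            for g in subset:
                word ^= frozenset(g)  # letters appearing twice cancel
            group.add(word)
    return group


designs = {
    "eighth": (10, [(1, 2, 3, 4, 5), (1, 2, 6, 7, 8), (1, 3, 7, 9, 10)]),
    "sixteenth": (12, [(1, 2, 3, 4, 5), (1, 2, 6, 7, 8),
                       (1, 2, 9, 10, 11), (1, 3, 6, 9, 12)]),
}
for name, (n, gens) in designs.items():
    group = identity_group(gens)
    runs = 2 ** n // (len(group) + 1)          # treatment combinations tested
    estimable = n + n * (n - 1) // 2           # main effects + two-factor interactions
    print(name, len(group), min(len(w) for w in group), runs, estimable)
```

Under these assumptions every word has at least five letters, and the run counts of 128 and 256 agree with the text.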

The extent to which these designs will be of practical value depends very much on the 
existence of a sufficient mass of reasonably homogeneous material to test the large number 
of treatment combinations without the necessity of dividing the material into smaller 
batches and using the device of confounding. An experiment involving say 256 different
treatment combinations is not large by modern standards. At Rothamsted, for example, an 
experiment involving 200 distinct treatments on 300 plots has been carried out for some 
years: this experiment was, however, made possible by utilizing the elimination of the 
effects of soil heterogeneity by highly complex confounding; the design, in fact, consisted of 
three 5 x 5 lattice squares necessitating seventy-five plots, and each of these plots was split 
into four subplots. The advantages of testing twelve factors, say, at the same time under 
virtually the same experimental conditions cannot, however, be ignored. Such an experi- 
ment should have more value, other things being equal, than two distinct experiments each 
testing some of the factors. An examination has not been made of the possibilities of reducing
block size by confounding for the above two designs, but it is probably necessary to sacrifice
a few two-factor interactions.


THE RELATIONSHIP BETWEEN FRACTIONAL REPLICATION AND CONFOUNDING 
It is clear that fractional replication and confounding are different aspects of the same
process. A $2^n$ design of $2^p$ blocks may be described as a 1 in $2^p$ replicate of a $2^{n+p}$ design
with no subdivision into blocks, by regarding the blocks as a $2^p$ system in p factors. As an







example, consider the $2^5$ design in $y_1, y_2, y_3, y_4$ and $y_5$ laid out in four blocks of eight and
confounding $y_1 y_2 y_3$, $y_3 y_4 y_5$ and $y_1 y_2 y_4 y_5$; superimposing two pseudo-factors $b_1$ and $b_2$, the
experiment is a quarter-replicate of a $2^7$ design in $y_1, y_2, y_3, y_4, y_5, b_1, b_2$. The identity on which
the quarter replicate is based is given by the equations
$$b_1 = y_1 y_2 y_3, \quad b_2 = y_3 y_4 y_5, \quad b_1 b_2 = y_1 y_2 y_4 y_5$$
or the equation
$$I = y_1 y_2 y_3 b_1 = y_3 y_4 y_5 b_2 = y_1 y_2 y_4 y_5 b_1 b_2.$$

If we examine this equation in the same way as in the previous sections, we find that the
design depends on the fact that the aliases of the following type may be ignored:
$$y_1 = y_2 y_3 b_1 = y_1 y_3 y_4 y_5 b_2 = y_2 y_4 y_5 b_1 b_2,$$
$$y_1 y_2 = y_3 b_1 = y_1 y_2 y_3 y_4 y_5 b_2 = y_4 y_5 b_1 b_2.$$

This example is worth pursuing. The design is frequently used with one replication only,
the error being estimated from three-factor and higher-order interactions. We set out below
the identity and 31 degrees of freedom together with all their aliases and their usual place
in the analysis of variance: blocks (B), treatment (T), or error (E). For convenience of
printing we denote the factors tested in the experiment by a, b, c, d, e instead of $y_1, y_2, y_3, y_4, y_5$
and the block factors by x and y. Capitals are used for treatment effects, thus conforming
to present usage.


I      = ABCX    = CDEY    = ABDEXY

A      = BCX     = ACDEY   = BDEXY      T
B      = ACX     = BCDEY   = ADEXY      T
AB     = CX      = ABCDEY  = DEXY       T
C      = ABX     = DEY     = ABCDEXY    T
AC     = BX      = ADEY    = BCDEXY     T
BC     = AX      = BDEY    = ACDEXY     T
ABC    = X       = ABDEY   = CDEXY      B
D      = ABCDX   = CEY     = ABEXY      T
AD     = BCDX    = ACEY    = BEXY       T
BD     = ACDX    = BCEY    = AEXY       T
ABD    = CDX     = ABCEY   = EXY        E
CD     = ABDX    = EY      = ABCEXY     T
ACD    = BDX     = AEY     = BCEXY      E
BCD    = ADX     = BEY     = ACEXY      E
ABCD   = DX      = ABEY    = CEXY       E
E      = ABCEX   = CDY     = ABDXY      T
AE     = BCEX    = ACDY    = BDXY       T
BE     = ACEX    = BCDY    = ADXY       T
ABE    = CEX     = ABCDY   = DXY        E
CE     = ABEX    = DY      = ABCDXY     T
ACE    = BEX     = ADY     = BCDXY      E
BCE    = AEX     = BDY     = ACDXY      E
ABCE   = EX      = ABDY    = CDXY       E
DE     = ABCDEX  = CY      = ABXY       T
ADE    = BCDEX   = ACY     = BXY        E
BDE    = ACDEX   = BCY     = AXY        E
ABDE   = CDEX    = ABCY    = XY         B
CDE    = ABDEX   = Y       = ABCXY      B
ACDE   = BDEX    = AY      = BCXY       E
BCDE   = ADEX    = BY      = ACXY       E
ABCDE  = DEX     = ABY     = CXY        E


If we take for each linear function of the yields the alias involving the smallest possible
number of letters, but remembering that x, y are pseudo-factors, so that X, Y and XY are of


equal importance and therefore XY should be regarded as a main effect and not an interaction,
we have the following allocation of contrasts to the three components of the analysis
of variance:
Blocks:      X, Y, XY.
Treatments:  A, B, C, D, E.
             AB = CX, AC = BX, BC = AX,
             CD = EY, DE = CY, CE = DY.
             AD, BD, AE, BE.

Error: AY, BY, DX, EX, AXY, BXY, CXY, DXY, EXY, ACD, BCD, ACE, BCE.
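
The allocation can be reproduced by a short enumeration. The Python sketch below is an illustration under stated assumptions: the tie-breaking rule (prefer the alias free of block letters when two aliases have equally few letters) is my own choice to match the printed list, and all function names are hypothetical.

```python
from itertools import combinations

FACTORS = "ABCDE"
IDENTITY = ["ABCX", "CDEY", "ABDEXY"]  # I = ABCX = CDEY = ABDEXY
BLOCK = frozenset("XY")


def aliases(effect):
    """The effect itself followed by its three aliases."""
    e = frozenset(effect)
    return [e] + [e ^ frozenset(w) for w in IDENTITY]


def weight(symbol):
    """Number of letters, counting X, Y and XY each as a single block factor."""
    return len(symbol - BLOCK) + (1 if symbol & BLOCK else 0)


def label(symbol):
    return "".join(sorted(symbol))


def place(effect):
    """B if confounded with blocks, T for main effects and two-factor interactions, else E."""
    if any(a <= BLOCK for a in aliases(effect)):
        return "B"
    return "T" if len(effect) <= 2 else "E"


def shortest(effect):
    """Alias with fewest letters; ties go to the symbol free of block letters (an assumption)."""
    return label(min(aliases(effect), key=lambda a: (weight(a), bool(a & BLOCK), label(a))))


effects = ["".join(c) for r in range(1, 6) for c in combinations(FACTORS, r)]
for p in "BTE":
    print(p, [shortest(e) for e in effects if place(e) == p])
```

The three lists have 3, 15 and 13 members, making up the 31 degrees of freedom.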


The four three-factor interactions could equally well be regarded as interactions between 
two-factor interactions and blocks. It would be anticipated that these would be smaller 
than the interactions of main effects and blocks. The purpose of the present exposition is 
to give a clear statement of the possible interpretations of the results of an individual 


experiment. Further remarks on the problem of interpretation are postponed to a later 
section in the paper. 


AN EXAMPLE OF FRACTIONAL REPLICATION WITH CONFOUNDING 


A design which has proved of practical utility is the half-replicate of a $2^6$ experiment arranged
in four blocks of eight plots.

Call the factors $y_1, y_2, y_3, y_4, y_5, y_6$. Then the best confounding is that in which, using full
replication, the block differences are all third-order interactions, say
$$y_1 y_2 y_3 y_4, \quad y_3 y_4 y_5 y_6 \quad\text{and}\quad y_1 y_2 y_5 y_6.$$
But it is impossible to keep main effects and interactions clear with this confounding,
whatever interaction is equated to the identity.
If we take the confounded interactions to be of the type
$$y_1 y_2 y_3, \quad y_3 y_4 y_5, \quad y_1 y_2 y_4 y_5,$$
and the interaction $y_1 y_2 y_3 y_4 y_5 y_6$ to be unity, then the following interactions are also
confounded:
$$y_4 y_5 y_6, \quad y_1 y_2 y_6 \quad\text{and}\quad y_3 y_6.$$
It will be found by enumeration of the possibilities that one first-order interaction must 
be sacrificed. All main effects and the other first-order interactions will have high-order 
aliases. 

It is interesting to examine this design in the same way as the $2^5$ above for the relations
between block-treatment interactions and treatment interactions.

There are, in fact, only thirty-two independent contrasts, and it is simplest to enumerate
these by operating on the identity relationship with the thirty-two possibilities for the $2^5$
system omitting $y_6$. As before, we insert block pseudo-factors. For simplicity of printing we
use A, B, C, D, E, F for the factors and X, Y for the block factors. Then


I=ABCDEF, X=ABC, Y=CDE, XY =ABDE, 
and combining these into one relationship, we have 


I = ABCDEF = ABCX = CDEY = ABDEXY = DEFX = ABFY = CFXY. 








A complete table of the aliases for this design follows: 








I      = ABCDEF = ABCX    = DEFX     = CDEY    = ABFY     = ABDEXY   = CFXY
A      = BCDEF  = BCX     = ADEFX    = ACDEY   = BFY      = BDEXY    = ACFXY
B      = ACDEF  = ACX     = BDEFX    = BCDEY   = AFY      = ADEXY    = BCFXY
AB     = CDEF   = CX      = ABDEFX   = ABCDEY  = FY       = DEXY     = ABCFXY
C      = ABDEF  = ABX     = CDEFX    = DEY     = ABCFY    = ABCDEXY  = FXY
AC     = BDEF   = BX      = ACDEFX   = ADEY    = BCFY     = BCDEXY   = AFXY
BC     = ADEF   = AX      = BCDEFX   = BDEY    = ACFY     = ACDEXY   = BFXY
ABC    = DEF    = X       = ABCDEFX  = ABDEY   = CFY      = CDEXY    = ABFXY
D      = ABCEF  = ABCDX   = EFX      = CEY     = ABDFY    = ABEXY    = CDFXY
AD     = BCEF   = BCDX    = AEFX     = ACEY    = BDFY     = BEXY     = ACDFXY
BD     = ACEF   = ACDX    = BEFX     = BCEY    = ADFY     = AEXY     = BCDFXY
ABD    = CEF    = CDX     = ABEFX    = ABCEY   = DFY      = EXY      = ABCDFXY
CD     = ABEF   = ABDX    = CEFX     = EY      = ABCDFY   = ABCEXY   = DFXY
ACD    = BEF    = BDX     = ACEFX    = AEY     = BCDFY    = BCEXY    = ADFXY
BCD    = AEF    = ADX     = BCEFX    = BEY     = ACDFY    = ACEXY    = BDFXY
ABCD   = EF     = DX      = ABCEFX   = ABEY    = CDFY     = CEXY     = ABDFXY
E      = ABCDF  = ABCEX   = DFX      = CDY     = ABEFY    = ABDXY    = CEFXY
AE     = BCDF   = BCEX    = ADFX     = ACDY    = BEFY     = BDXY     = ACEFXY
BE     = ACDF   = ACEX    = BDFX     = BCDY    = AEFY     = ADXY     = BCEFXY
ABE    = CDF    = CEX     = ABDFX    = ABCDY   = EFY      = DXY      = ABCEFXY
CE     = ABDF   = ABEX    = CDFX     = DY      = ABCEFY   = ABCDXY   = EFXY
ACE    = BDF    = BEX     = ACDFX    = ADY     = BCEFY    = BCDXY    = AEFXY
BCE    = ADF    = AEX     = BCDFX    = BDY     = ACEFY    = ACDXY    = BEFXY
ABCE   = DF     = EX      = ABCDFX   = ABDY    = CEFY     = CDXY     = ABEFXY
DE     = ABCF   = ABCDEX  = FX       = CY      = ABDEFY   = ABXY     = CDEFXY
ADE    = BCF    = BCDEX   = AFX      = ACY     = BDEFY    = BXY      = ACDEFXY
BDE    = ACF    = ACDEX   = BFX      = BCY     = ADEFY    = AXY      = BCDEFXY
ABDE   = CF     = CDEX    = ABFX     = ABCY    = DEFY     = XY       = ABCDEFXY
CDE    = ABF    = ABDEX   = CFX      = Y       = ABCDEFY  = ABCXY    = DEFXY
ACDE   = BF     = BDEX    = ACFX     = AY      = BCDEFY   = BCXY     = ADEFXY
BCDE   = AF     = ADEX    = BCFX     = BY      = ACDEFY   = ACXY     = BDEFXY
ABCDE  = F      = DEX     = ABCFX    = ABY     = CDEFY    = CXY      = ABDEFXY


The partition of the degrees of freedom in the analysis of variance which would generally 
be made is the following: 


                              D.F.
Blocks                          3
Treatments: Main effects        6
            Interactions       14
Error                           8
                              ----
Total                          31

The table of aliases is condensed below by the omission of all aliases involving more than
two factors (counting, as before, XY as a single factor as well as X and Y).


Effects A, B, D, E have aliases of at least three letters, but C = FXY and F = CXY. 
Effects AD, BD, AE, BE have aliases of at least three letters, but 


AB=CX=FY, AC=BX, BC=AX, CD=EY, EF =DX, 
CE=DY, DE=FX=CY, DF=EX, BF=AY, AF= BY. 
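
The condensed list can be regenerated from the three generators of the identity relationship. The Python sketch below is illustrative; the names are hypothetical and the weight function encodes the convention stated above that X, Y and XY each count as a single factor.

```python
from itertools import combinations

GENERATORS = ["ABCDEF", "ABCX", "CDEY"]
BLOCK = frozenset("XY")


def product(*words):
    """Product of symbols in the 2^n system; repeated letters cancel."""
    out = frozenset()
    for w in words:
        out ^= frozenset(w)
    return out


def weight(symbol):
    """Number of factors, counting X, Y and XY each as a single block factor."""
    return len(symbol - BLOCK) + (1 if symbol & BLOCK else 0)


# The seven non-unit words of I = ABCDEF = ABCX = CDEY = ...
IDENTITY = [product(*s) for r in (1, 2, 3) for s in combinations(GENERATORS, r)]


def short_aliases(effect):
    """Aliases of an effect that involve at most two factors."""
    found = [product(effect, w) for w in IDENTITY]
    return sorted("".join(sorted(a)) for a in found if weight(a) <= 2)


for effect in ["A", "C", "F", "AB", "AD", "CF", "DE"]:
    print(effect, short_aliases(effect))
```

The two-factor interaction CF appears with the alias XY, that is, it is the first-order interaction sacrificed to blocks.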


In an experiment in which block-treatment interactions cannot be assumed to be negligible 
in relation to the effects it is desired to estimate, the interpretation of most two-factor 
interactions is difficult if not impossible. The following identities of practical interest exist 
for the terms which would be used to estimate the error: ACD, BCD, ACE, BCE have 
aliases of three letters and are either three-factor interactions or interactions between blocks 


and two-factor interactions, but ABD= EXY, ABE=DXY, ADE = BXY, and 
BDE = AXY. 




This design is very similar in result to the fully replicated but confounded $2^5$ design
described above.


FRACTIONAL REPLICATION IN THE $3^n$ SYSTEM

Here we have to consider treatment effects assessed from powers of one-third of a complete
replicate. Only those treatment combinations represented by points of the lattice lying on
the hyperplane
$$\alpha_1 y_1 + \alpha_2 y_2 + \dots + \alpha_n y_n = 0, \text{ or } 1, \text{ or } 2 \pmod 3$$
will be included in a one-third replicate.

A particular treatment effect is given by the differences between the means of those plots
represented by points on the following three planes:
$$\beta_1 y_1 + \beta_2 y_2 + \dots + \beta_n y_n = 0 \pmod 3,$$
$$\beta_1 y_1 + \beta_2 y_2 + \dots + \beta_n y_n = 1 \pmod 3,$$
$$\beta_1 y_1 + \beta_2 y_2 + \dots + \beta_n y_n = 2 \pmod 3.$$
It is obvious that the points lying on the first plane will also lie on the planes
$$(\beta_1 + \lambda\alpha_1) y_1 + (\beta_2 + \lambda\alpha_2) y_2 + \dots + (\beta_n + \lambda\alpha_n) y_n = 0 \pmod 3, \text{ for } \lambda = 1 \text{ and } 2;$$
the points on the other two planes will lie on these planes with 1 and 2 respectively on the
right-hand side of the equation.
The aliases of each pair of degrees of freedom are therefore obtained by multiplication of
its symbol by
$$y_1^{\alpha_1} y_2^{\alpha_2} \cdots y_n^{\alpha_n},$$
and by its square.
As an example, suppose a third replicate of a $3^3$ design is based on the inclusion only of
those treatment combinations represented by the symbol $y_1 y_2 y_3$ ($y_1 + y_2 + y_3 = 0$, say), then
the aliases are exemplified by the relationship $y_1 = y_1 y_2^2 y_3^2 = y_2 y_3$.
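
The statement about the planes can be verified numerically for the example just given. The Python sketch below is only an illustration; it assumes the one-third replicate is the set of lattice points with $y_1 + y_2 + y_3 = 0 \pmod 3$, and the variable names are hypothetical.

```python
from itertools import product


def value(coeffs, point):
    """Left-hand side of a plane equation, reduced modulo 3."""
    return sum(c * y for c, y in zip(coeffs, point)) % 3


alpha = (1, 1, 1)   # identity symbol y1 y2 y3
beta = (1, 0, 0)    # effect y1

# One-third replicate: lattice points with y1 + y2 + y3 = 0 (mod 3).
fraction = [p for p in product(range(3), repeat=3) if value(alpha, p) == 0]

agree = {}
for lam in (1, 2):
    combined = tuple((b + lam * a) % 3 for a, b in zip(alpha, beta))
    # Within the fraction, the combined plane must split the points exactly as beta does.
    agree[combined] = all(value(beta, p) == value(combined, p) for p in fraction)
print(len(fraction), agree)
```

The two combined exponent vectors, (2, 1, 1) and (0, 2, 2), correspond after squaring to the symbols $y_1 y_2^2 y_3^2$ and $y_2 y_3$, since a symbol and its square denote the same pair of degrees of freedom.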


THE CONFOUNDING OF ONE REPLICATE OF A $3^3$ EXPERIMENT IN THREE
BLOCKS OF NINE PLOTS

A frequently used design is the $3^3$ in three blocks of nine plots, testing all combinations of
three factors each at three levels. This design is formally a one-third replicate of a $3^4$ design.
Suppose the factors are $y_1$, $y_2$, and $y_3$ and let blocks be denoted by the pseudo-factor $b$;
a three-factor interaction of $y_1$, $y_2$, and $y_3$, say $y_1 y_2 y_3$, is usually confounded in order to keep
main effects and first-order interactions free of block effects.

Then $b = y_1 y_2 y_3$ or $I = y_1 y_2 y_3 b^2$, since $b^3 = 1$.

As in the case of the $2^5$ design, we work out the aliases of each pair of degrees of freedom:
each pair of degrees of freedom will in this case have two aliases:

$$\begin{aligned}
y_1 &= y_1 y_2^2 y_3^2 b = y_2 y_3 b^2, & y_2 y_3 &= y_1 y_2^2 y_3^2 b^2 = y_1 b^2, \\
y_2 &= y_1 y_2^2 y_3 b^2 = y_1 y_3 b^2, & y_2 y_3^2 &= y_1 y_2^2 b^2 = y_1 y_3^2 b^2, \\
y_3 &= y_1 y_2 y_3^2 b^2 = y_1 y_2 b^2, & y_1 y_2 y_3 &= y_1 y_2 y_3 b = b, \\
y_1 y_2 &= y_1 y_2 y_3^2 b = y_3 b^2, & y_1 y_2 y_3^2 &= y_1 y_2 b = y_3 b, \\
y_1 y_2^2 &= y_1 y_3^2 b = y_2 y_3^2 b, & y_1 y_2^2 y_3 &= y_1 y_3 b = y_2 b, \\
y_1 y_3 &= y_1 y_2^2 y_3 b = y_2 b^2, & y_1 y_2^2 y_3^2 &= y_1 b = y_2 y_3 b, \\
y_1 y_3^2 &= y_1 y_2^2 b = y_2 y_3^2 b^2.
\end{aligned}$$
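
The alias table can be generated by arithmetic on exponent vectors modulo 3. The Python sketch below is an illustration under two assumptions: the identity is taken as $I = y_1 y_2 y_3 b^2$, and each symbol is normalised so that its first non-zero exponent is 1 (a symbol and its square denote the same pair of degrees of freedom). The function names are hypothetical.

```python
from itertools import product

IDENTITY = (1, 1, 1, 2)            # exponents of y1, y2, y3, b in I = y1 y2 y3 b^2
NAMES = ("y1", "y2", "y3", "b")


def normalise(v):
    """Scale so that the first non-zero exponent is 1 (square the symbol if it is 2)."""
    first = next(e for e in v if e)
    return v if first == 1 else tuple((2 * e) % 3 for e in v)


def aliases(effect):
    """Products of the effect with the identity symbol and with its square."""
    return [normalise(tuple((e + lam * g) % 3 for e, g in zip(effect, IDENTITY)))
            for lam in (1, 2)]


def show(v):
    return " ".join(n + ("^2" if e == 2 else "") for n, e in zip(NAMES, v) if e)


# The 13 pairs of degrees of freedom of the 3^3 system (block exponent 0).
effects = sorted({normalise(v + (0,)) for v in product(range(3), repeat=3) if any(v)})
for effect in effects:
    print(show(effect), "=", " = ".join(show(a) for a in aliases(effect)))
```

Each of the thirteen rows printed carries two aliases, and the row for $y_1 y_2 y_3$ contains the block symbol $b$ alone.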







Here again the identities could result in difficulty in interpretation—as of course could 
have been predicted from the examination of the possible arrangements in blocks of nine 
of the 3* design. The main effects may be regarded as clear, and three of the first-order 
interactions. The remaining two-factor interactions could be ascribed to differential effects 
of the factors on the three blocks. The three-factor interactions which are not confounded 
with blocks are also ascribable to interactions of main effects and blocks and may therefore 
be used to form an estimate of the error of these effects. 


GENERAL REMARKS ON CONFOUNDING


The device of confounding is used almost without exception in agricultural experiments in 
order to reduce the block size to twelve or less plots. As the above results indicate there are 
two aspects which then need careful consideration, (a) the estimation of interactions, and 
(b) the estimation of the experimental error. 

The main purpose of the factorial design is the estimation of main effects and interactions 
between pairs of factors and thence of the effect of any one factor in the presence and absence 
of each of the other factors. It is clear that when it is necessary to remove soil heterogeneity 
by confounding, the interpretation of a small experiment involving a few factors may be 
exceedingly difficult because of the possibility of block-treatment interactions. It is possible 
to use the rule that a large contrast should be regarded as the interaction between 
whichever pair of main effects is the larger, but this rule will break down in some cases when, 
for example, the contrast has two aliases AB and CD, and effects A and C are large and 
B and D small. In the case of a series of experiments, a device which might be helpful is the
use of permutations of the possible identity relationships, one at each centre. The modern 
emphasis in agricultural experimentation is on series of experiments at various places and 
in several years, rather than on individual experiments. Interactions of pairs of factors 
will be estimated correctly from a large series of experiments if treatments are assigned 
at random to blocks. 

The evaluation of two-factor interactions for individual experiments depends on the 
assumption that block-treatment interactions are small compared with the experimental 
error. Yates (1935) examined several experiments for the existence of such interactions and 
found no evidence of them. Since that time a large number of experimental results which 
can be used to provide information on the question have been accumulated, and an investiga- 
tion of these has indicated that block-treatment interactions are negligible and may be 
ignored (Kempthorne, 1947). 

With regard to the estimation of error, in so far as tests of significance are of interest, it can 
be said that the analysis of variance does provide a test of significance of the hypothesis that 
the treatments have an overall effect different from zero. In agricultural experimentation, the 
term error is used to denote block-treatment interactions. Thus in the simple randomized 
block experiment, it is possible to evaluate the difference between two treatments from each 
block, and it is the variability of this difference from block to block which is regarded as the
error. In general, as there are usually few blocks, and the error of each comparison would be 
determined with poor accuracy, the errors of all the possible independent comparisons 
are pooled to give a common estimate. If the treatments were duplicated at random 













within each block, the analysis would be of the form (r being the number of blocks and 
t of treatments): 


                              D.F.
Blocks                        r - 1
Treatments                    t - 1
Treatments by blocks          (r - 1)(t - 1)
Within blocks                 rt
                              -------
Total                         2rt - 1


The component ‘within blocks’ could more accurately be described as experimental 
error, but would not be used to evaluate the errors of treatment effects, since the experi- 
menter is interested in the constancy of treatment effects from block to block. There is 
therefore little point in actually carrying out such an experiment. In a factorial experiment 
with replication, the components which could be evaluated consist of replicates, effects and 
low-order interactions, high-order interactions, and interactions of treatments and repli- 
cates. On the assumption that the sum of squares for interaction of treatments and 
replicates is homogeneous, the mean square for high-order interactions will include the mean 
square for treatments x replicates plus a component of variance due to high-order inter- 
actions. When only one replication is used, it is assumed that the component of variance due 
to high-order interactions is small, and that the high-order interactions mean square can be
regarded as an estimate of error. It is important to bear in mind that an individual agri- 
cultural experiment can give information only for a particular set of experimental conditions 
and that it is known from experience that place to place and year to year variability is 
considerable. It would therefore be uneconomical to utilize available resources to determine 
effects and their errors at a few particular places very accurately, but preferable to sacrifice 


replication at each place in order to have information over a large range of experimental 
conditions. 


MIXED SYSTEMS 


It is not proposed to examine mixed systems of the type $p^m q^n$, where p and q are primes, in
the present paper. It is clear, however, that the possibilities of complete confounding and
fractional replication are very limited. A $1/p$th replicate must obviously include $p^{m-1}$
combinations of the m factors combined with all the $q^n$ combinations of the n factors. For
the examination of treatment aliases the system may be regarded as the product of the two
separate systems. Thus if p = 3, m = 2, q = 2, n = 3 and the factors are $y_1, y_2, y_3, y_4, y_5$, then
a half replicate would be obtained by putting $I = y_3 y_4 y_5$. The aliases which result are
exemplified by the following:
$$y_1 = y_1 y_3 y_4 y_5, \quad y_1 y_2 = y_1 y_2 y_3 y_4 y_5,$$
$$y_1 y_2^2 = y_1 y_2^2 y_3 y_4 y_5, \quad y_3 = y_4 y_5.$$
Such designs with fractional replication or complete confounding are therefore useful only
when the corresponding designs for the two separate systems are feasible.


COMMENTS ON ‘THE DESIGN OF OPTIMUM MULTIFACTORIAL EXPERIMENTS’ 
In a paper entitled ‘The Design of Optimum Multifactorial Experiments’, Plackett & 


Burman (1946) put forward designs more specifically for physical and industrial research, 
which are of interest from the point of view of fractional replication. In order to estimate the 







effect of varying nine components of an assembly, each component having two possible
values, a nominal (-) and an extreme (+), they put forward the following design which
requires the testing of sixteen assemblies:

                       Components
               1  2  3  4  5  6  7  8  9
Assembly  1    +  -  -  -  +  -  -  +  +
          2    +  +  -  -  -  +  -  -  +
          3    +  +  +  -  -  -  +  -  -
          4    +  +  +  +  -  -  -  +  -
          5    -  +  +  +  +  -  -  -  +
          6    +  -  +  +  +  +  -  -  -
          7    -  +  -  +  +  +  +  -  -
          8    +  -  +  -  +  +  +  +  -
          9    +  +  -  +  -  +  +  +  +
         10    -  +  +  -  +  -  +  +  +
         11    -  -  +  +  -  +  -  +  +
         12    +  -  -  +  +  -  +  -  +
         13    -  +  -  -  +  +  -  +  -
         14    -  -  +  -  -  +  +  -  +
         15    -  -  -  +  -  -  +  +  -
         16    -  -  -  -  -  -  -  -  -


Yates put forward a similar design in his 1935 paper for the weighing of a number of small 
articles on a balance which required a zero correction, as an example of the estimation of the 
effects of independent factors. In his case there was a close formal analogy to the 2” factorial 
system, and it will now be shown that Plackett & Burman’s design given above is a high- 
order fractional design of the type discussed in the present paper. 

Denoting the nominal values by unity and the extreme values of the nine components by
a, b, c, d, e, f, g, h, k in order, the treatment combinations represented are 1, aehk, abfk, abcg,
abcdh, bcdek, acdef, bdefg, acefgh, abdfghk, bceghk, cdfhk, adegk, befh, cfgk, dgh. It is found merely
by one-by-one examination of the three-factor interactions that all the above sets of treatment
combinations occur with the same sign in the following:

ABE, ACK, BCF, CDG, DEH.

The same will be true for all the members of the Abelian group of which the above five
interactions are generators. The identity relationship is therefore:

I = ABE   = ACK    = BCEK  = BCF    = ACEF  = ABFK     = EFK
  = CDG  = ABCDEG = ADGK   = BDEGK = BDFG   = ADEFG = ABCDFGK  = CDEFGK
  = DEH  = ABDH   = ACDEHK = BCDHK = BCDEFH = ACDFH = ABDEFHK  = DFHK
  = CEGH = ABCGH  = AEGHK  = BGHK  = BEFGH  = AFGH  = ABCEFGHK = CFGHK.

The identities of interest to the experimenter are the following:
I = ABE = ACK = BCF = EFK = CDG = DEH;
from these we derive the following aliases for main effects:

A = BE = CK,            F = BC = EK,
B = AE = CF,            G = CD,
C = AK = BF = DG,       H = DE,
D = CG = EH,            K = AC = EF.
E = AB = FK = DH,
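
The one-by-one examination can be repeated mechanically. The Python sketch below is illustrative only: the sixteen treatment combinations are my reading of the list above (with k denoting the ninth component), and the sign convention (+1 when a factor is at its extreme value, -1 at its nominal value) is an assumption of the sketch.

```python
RUNS = ["", "aehk", "abfk", "abcg", "abcdh", "bcdek", "acdef", "bdefg",
        "acefgh", "abdfghk", "bceghk", "cdfhk", "adegk", "befh", "cfgk", "dgh"]
GENERATORS = ["abe", "ack", "bcf", "cdg", "deh"]


def sign(interaction, run):
    """Sign with which a treatment combination enters an interaction contrast."""
    s = 1
    for letter in interaction:
        s *= 1 if letter in run else -1
    return s


# Each generator should take a single sign over all sixteen combinations.
constant = {g.upper(): {sign(g, r) for r in RUNS} for g in GENERATORS}
print(constant)

# Consequently the A contrast is minus the BE contrast, and so on.
print(all(sign("a", r) == -sign("be", r) for r in RUNS))
```

Under these assumptions every generator takes the sign -1 throughout, which is why each main-effect contrast is the negative of the contrasts for its aliased two-factor interactions.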


In all cases, the contrasts estimating main effects are minus the contrasts estimating
interactions. If, for example, the interaction of B and E is negative, and A has no effect, the













conclusion drawn by the experimenter will be that A has a positive effect. It is possible but 
rather difficult to imagine physical systems in which effects will not interact, and interpreta- 
tion of the results of experiments based on this design may often be impossible. With nine 
factors, it appears from the present work that the minimum number of combinations which 
should be tested is 128, that is one-quarter of a replication, though it is possible that by making 
less stringent assumptions about two-factor interactions, one-eighth of a replication might 
give intelligible results. A possible instance in which it might be feasible to use the designs 
discussed is when it is expected that only one or two of the factors have an effect, and the 
problem is to determine as quickly as possible which of the nine factors are responsible. 
An example in which a high-order fractional design was used in such circumstances with good 
results has been described by Tippett (1936). A detailed examination of all the designs put 
forward by Plackett & Burman will not be undertaken, but the lines on which such an 
examination would proceed and the broad conclusions which would emerge are obvious 
from the above examination of one of their simpler designs. 


CONCLUSIONS


A method of examining fractional replication and confounding for some types of factorial 
experiments is described. The formal equivalence between the two is indicated and the 
implications of this equivalence discussed. Further progress will follow on group theory 
lines and this is being examined, together with the possibility of fractional replication when 
the fraction is greater than unity. The possibilities are explored of the estimation of main 
effects and two-factor interactions of many factors by testing only a small proportion of 
the possible treatment combinations. An examination on these lines is made of designs 
proposed by Plackett & Burman. 


REFERENCES 


Finney, D. J. (1945). The fractional replication of factorial arrangements. Ann. Eugen., Lond., 12,
291-301.

Fisher, R. A. (1942). The theory of confounding in factorial experiments in relation to the theory of
groups. Ann. Eugen., Lond., 11, 341-53.

Kempthorne, O. (1947). A note on differential responses in blocks. J. Agric. Sci. 37, 245-48.

Plackett, R. L. & Burman, J. P. (1946). The design of optimum multifactorial experiments.
Biometrika, 33, 305-25.

Tippett, L. H. C. (1936). Applications of Statistical Methods to the Control of Quality in Industrial
Production. Manchester Statist. Soc.

Yates, F. (1935). Complex experiments. J. R. Statist. Soc. Suppl. 2, 181-223.

Yates, F. (1937). The design and analysis of factorial experiments. Tech. Commun. Imp. Bur. Soil Sci.,
no. 35.







A COMPARISON OF STRATIFIED WITH UNRESTRICTED RANDOM 
SAMPLING FROM A FINITE POPULATION* 


By P. ARMITAGE, B.A. 


1. INTRODUCTION 


1-1. We are concerned in this paper with the problem of estimating the mean value $\mu$
of a variable $x$ in a population, by taking a sample which is in some way representative of the
population. It has been realized since Bowley’s paper (1926), and more particularly since 
Neyman’s more comprehensive survey (1934), that a certain degree of precision in the 
estimate can often be obtained more economically by stratified random sampling (usually 
referred to merely as stratified sampling) than by unrestricted random sampling (usually 
called merely random sampling). In the stratified method, the population is divided into 
several strata, the sample size divided in some prearranged way among the strata, and 
sampling performed at random from each stratum. In unrestricted random sampling, a 
random selection is made from the whole population, and the method may be regarded as 
a particular case of stratification, where the number of strata is one. 

Some text-books deal briefly with stratified sampling. Wilks (1943) considers only 
infinite populations, and denotes by representative sampling what we should call a par- 
ticular type of stratified sampling (see § 1-2). The subject is treated by Kendall (1946, 
pp. 249-52), but he makes no comparison with unrestricted random sampling. We shall 
begin by introducing several well-known results which will be needed later. 


1-2. The summation sign > will be used throughout for z , and > for z . In general, 
i=1 k k=l 


> is used for a single summation, })> for a double summation, and the suffix k where no 
K 


summation is involved.

We shall consider the following position: A population $\pi$ of size $N$ is subdivided into $r$
strata, $\pi_k$, of size $N_k$ ($\sum N_k = N$). The variable $x$ is distributed so that the mean and variance
(divisor $N_k$) within $\pi_k$ are respectively $\mu_k$, $\sigma_k^2$. It is required to estimate $\mu = \sum N_k \mu_k / N$, the
grand mean.

Suppose a given sample size, $n$, is divided so that $n_k$ items are sampled at random from
$\pi_k$ ($\sum n_k = n$). We may denote the $j$th observation from the $k$th sample by $x_{kj}$ ($j = 1, 2, ..., n_k$),
and the mean and variance of the $k$th sample by $\bar{x}_k$ and $s_k^2$, which are known to be unbiased
estimates of $\mu_k$ and
$$\frac{(n_k - 1) N_k \sigma_k^2}{n_k (N_k - 1)}$$
respectively (see, for example, Kendall, 1943, p. 284).

It seems intuitively obvious to take as our estimate of $\mu$,
$$m = \sum N_k \bar{x}_k / N, \qquad \text{(i)}$$
which is clearly unbiased. This is, however, not the only unbiased estimate which is a linear
function of the $x_{kj}$. For instance, $\sum N_k x_{k1} / N$ also satisfies the conditions. Neyman (1934)
has shown that, for fixed values of $n_k$, the estimate given by (i) is the best linear unbiased
estimate of $\mu$, in the sense that its sampling variance is less than that of any other linear
unbiased estimate.


* Communication from the National Physical Laboratory. 










The question now arises: given a sample size $n$, how shall we choose the $n_k$ so as to minimize
var (m), where m is given by (i)? Bowley had not considered 'best' estimates, and he suggested
that $n_k$ should be proportional to $N_k$, i.e.
$$n_k = \frac{n N_k}{N}. \qquad \text{(ii)}$$
Neyman (1934) showed, by the method given in § 2, that the values of $n_k$ which minimize
var (m) are
$$n_k = \frac{n N_k \sigma_k \sqrt{N_k/(N_k - 1)}}{\sum N_k \sigma_k \sqrt{N_k/(N_k - 1)}} = \frac{n N_k \sigma'_k}{\sum N_k \sigma'_k}, \qquad \text{(iii)}$$
where $\sigma'_k = \sigma_k \sqrt{N_k/(N_k - 1)}$.
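
The two allocation rules (ii) and (iii) can be compared directly. The Python sketch below is an illustration only; the stratum sizes and standard deviations are hypothetical figures chosen for the example, and the results are left as fractions rather than rounded to whole sample sizes.

```python
from math import sqrt


def proportionate(n, sizes):
    """Allocation (ii): n_k proportional to the stratum size N_k."""
    N = sum(sizes)
    return [n * Nk / N for Nk in sizes]


def optimum(n, sizes, sigmas):
    """Allocation (iii): n_k proportional to N_k * sigma'_k."""
    adjusted = [s * sqrt(Nk / (Nk - 1)) for Nk, s in zip(sizes, sigmas)]  # sigma'_k
    total = sum(Nk * a for Nk, a in zip(sizes, adjusted))
    return [n * Nk * a / total for Nk, a in zip(sizes, adjusted)]


sizes = [100, 200, 700]      # hypothetical N_k
sigmas = [1.0, 2.0, 4.0]     # hypothetical sigma_k (divisor N_k)
print([round(x, 2) for x in proportionate(100, sizes)])
print([round(x, 2) for x in optimum(100, sizes, sigmas)])
```

In this example the optimum rule moves sample units towards the most variable stratum, as (iii) implies.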

We shall refer to these two methods of defining the $n_k$, by (ii) and (iii) respectively, as
proportionate sampling, and optimum stratified sampling, denoting by $m_p$ and $m_o$ the estimates
of $\mu$ obtained from (i) by the two methods, and by $\bar{x}$ the estimate of $\mu$ given by the mean of
an unrestricted random sample of $n$ from the whole population $\pi$.

The optimum stratified method thus requires a knowledge of the $\sigma'_k$. In practice, we should
never know the $\sigma'_k$ exactly, unless the population had been subjected to exhaustive sampling,
in which case $\mu$ would be known exactly. Sukhatme (1935) has shown that, at any rate
for large $N_k$, if the $\sigma_k^2$ are estimated from a preliminary sample, and the $n_k$ defined by using
these estimates in (iii), there is a high probability that var ($m_o$) < var ($m_p$).* The efficiency
of this method will of course depend on the size of the preliminary sample, and Sukhatme's
investigation only dealt with one value of this (15 from each stratum). In some cases we
should be able to form a fairly good estimate of the $\sigma'_k$ from past experience, and there would
be no need for a preliminary sample.

Another interesting comparison which has not been extensively investigated is that
between optimum stratified sampling and unrestricted random sampling. Wilks (1943) deals
with this for infinite populations, and obtains (pp. 88, 89) the result (in our notation),
$$\operatorname{var}(m_o) \le \operatorname{var}(m_p) \le \operatorname{var}(\bar{x}), \qquad \text{(iv)}$$
the first equality holding only when all the $\sigma_k$ are equal, and the second only when all the
$\mu_k$ are equal. (Our $N_k/N$ are replaced by $p_k$, where $p_k$ is the probability that $x$, when drawn
at random from $\pi$, is a member of $\pi_k$, so that, for instance, (iii) becomes
$$n_k = \frac{n p_k \sigma_k}{\sum p_k \sigma_k}.$$
Representative sampling as defined by Wilks is what we should call proportionate sampling.)
We shall show in § 2 that for finite populations, while the relation 





ny = 


var (m,) < var (m,) (v). 


is always tiue, the equality holding only when all the o}, are equal, it is not necessarily true 


that var (m,,) < var (Z), (vi) 


* No confusion need arise from the fact that the symbol $m_o$ and the term optimum are still used when estimates of the $\sigma_k$ are used in (iii).






and in fact in the limiting case when all the $\mu_k$ are equal, it is true that

$$\mathrm{var}(m_p) \ge \mathrm{var}(\bar{x}), \qquad \text{(via)}$$

so that if the $\sigma_k$ are also equal

$$\mathrm{var}(m_o) \ge \mathrm{var}(\bar{x}); \qquad \text{(vib)}$$

i.e. random sampling gives a more accurate estimate of the mean than any stratified sampling. We shall see, however, that in almost all practical cases (iv) is true.


2. DERIVATION OF FORMULAE 
2-1. Results (iii) and (v). Using the notation of § 1-2, we have the standard result that 











var (Z,) = 7 (3) (see e.g. Wilks, p. 86). (vii) 
Therefore from (i), var (m) = > val (x *) 
= 
No? sai 
= Dye, md. (viii) 


The result (iii) may be obtained quite easily by finding the values of the $n_k$ which minimize (viii) subject to the condition $\sum n_k = n$, using the method of Lagrange multipliers. Then, substituting (ii) and (iii) in (viii), and applying Schwarz's inequality, we have (v). The following method is due to Neyman.

It may be verified from (viii) that 

No; XNo\* 1 


N- . 1 F No}\2 F 
var (m) = an ENO ag Eat A — Wa EM(ei-* Ne) - =) 





If we denote the three terms of (ix) by $A$, $B$ and $C$, so that

$$\mathrm{var}(m) = A + B - C,$$

it will be seen that $A$ and $C$ are independent of the $n_k$ and, since $B$ is non-negative, it follows that the values of $n_k$ which minimize $\mathrm{var}(m)$ must minimize $B$. Now $B = 0$ if and only if

$$n_k = \frac{n N_k \sigma_k}{\sum N_k \sigma_k},$$

which is (iii). For these values of $n_k$, $m = m_o$, and

$$\mathrm{var}(m_o) = A - C. \qquad \text{(x)}$$

If we define $n_k$ by (ii), so that $m = m_p$, we see from (ix) that $B = C$, so that

$$\mathrm{var}(m_p) = A. \qquad \text{(xi)}$$


From (x) and (xi), we obtain (v), the equality holding only when $C = 0$, which is true only when $\sigma_k - \sum N_k\sigma_k/N = 0$ for all $k$, i.e. when the $\sigma_k$ are all equal.





2-2. Unrestricted random sampling. The variance of a random observation x from 7 is 


o = 2Net, 5, 










where $S$ is the weighted sum of squares of the $\mu_k$, i.e. $S = \sum N_k(\mu_k - \mu)^2/N$. From (vii),


var (%) = — aes — Waa) 








N-1 
N-n N-n 
= aN) =* + aan 8 
N-n N-n 
~ mN(N— nN(W—1) 2 - o?+T— n(N — n* 
; N-n 
From (xi), var (m,) = Tin No? 
Denoting om a a by H, an a =o? by K, we have 
N- = 2 
., N-n, . N-a ,) 
var (Z) = Nn H+ a5 | ‘ 
N { (xii) 
and var (m,) = — K. | 
LNof- XN, de? 
Now H-K= Wu —1) <0, 


and if we regard each $N_k$ as being of the same order, $O(N)$, then $H - K$ is $O(N^{-1})$, which means that when all the $\mu_k$ are equal, $S = 0$, and so

$$\mathrm{var}(\bar{x}) \le \mathrm{var}(m_p),$$

which is (via); but as $N \to \infty$,

$$\mathrm{var}(\bar{x}) \sim \mathrm{var}(m_p) + S/n, \qquad \text{(xiii)}$$

giving Wilks's result (p. 88) that for infinite populations (vi) is true, the equality holding only when all the $\mu_k$ are equal.

From (v) and (via) it follows that for finite populations, when all the $\mu_k$ are equal and all the $\sigma_k$ are equal, (vib) is true, i.e. in this case unrestricted random sampling is actually better than any stratified random sampling with the same sample size.


3. GENERAL COMPARISON 
3-1. From (ix), (x) and (xii),
¢ = var (%) — var (m,)— arn S= ure Da (P-—Q-R), (xiv) 
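where

$$P = N^2\sum N_k\sigma_k^2 - N\Bigl(\sum N_k\sigma_k\Bigr)^2 \ge 0 \quad \text{(equality if all } \sigma_k \text{ are equal)},$$

$$Q = n\Bigl(\sum N_k\sigma_k^2 - N\sum\sigma_k^2\Bigr) \le 0,$$

$$R = N^2\sum\sigma_k^2 - \Bigl(\sum N_k\sigma_k\Bigr)^2 \ge 0.$$

As $N \to \infty$, $P$, $Q$ and $R$ are respectively $O(N^3)$, $O(N)$ and $O(N^2)$, and so we have the result that for infinite populations $\phi \ge 0$, which with (xiii) is easily seen to be equivalent to Wilks's result (iv).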

In the finite case, however, by suitable choice of the $\sigma_k$ and $n$ we can make $\phi$ either positive or negative. For instance, if the $\sigma_k$ are all equal and $n$ is sufficiently small, $R$ predominates in (xiv), and $\phi < 0$. As $n$ increases to $N$, $\phi$ increases to 0. (By considering $Q$ and $R$, it is not obvious that $\phi \to 0$ in this case, but it must be remembered that (xiv) is only true if the $n_k$ are given by (iii), and this becomes impossible as $n$ approaches $N$. This will be remarked upon below.) If the $\sigma_k$ are sufficiently unequal, $P$ will predominate and $\phi > 0$. In this case


| hae Eee ee ae . . 
the factor NW —1n nin (xiv) will be positive, and ¢ will decrease as n increases. 


The situations, then, in which (vib) is likely to be true (provided that the $n_k$ are really given by (iii)) are when the $\mu_k$ are nearly equal, and when $N$ is small or the $\sigma_k$ are nearly equal. We shall consider some examples in § 4.


3-2. In applying the procedure of stratification, we shall make two departures from the theory outlined above which will tend to nullify the advantages of the stratified method. The first is that, as was pointed out in § 1-2, we shall never know the $\sigma_k$ exactly, and the degree to which our estimates from which the $n_k$ were obtained are accurate depends on the circumstances. It seems quite likely that Sukhatme's result will be fairly well applicable to finite populations, but there is an opportunity for research on this point.

The second respect in which we depart from theory lies in the fact that, even if the $\sigma_k$ are exactly known, the $n_k$ that we choose can never be exactly as given by (iii); first because they must be integers, which makes a considerable difference when $n$ is small (the size of the smallest stratified sample from which an unbiased estimate of $\mu$ can be made is clearly $r$); and secondly, $n_k$ cannot take values greater than $N_k$. In this latter case, if the values of, say, $s$ of the $n_k$, as given by (iii), are greater than the corresponding $N_k$, we should let $n_k = N_k$ for these $s$ strata, and then set the other $(r - s)$ values of $n_k$ proportional to the corresponding $N_k\sigma_k$. This will clearly decrease $\mathrm{var}(\bar{x}) - \mathrm{var}(m_o)$ as given by (xiv). For example, when
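$n = N$, we have

$$\mathrm{var}(\bar{x}) = \mathrm{var}(m_o) = \frac{(N-n)}{(N-1)n}\,S = 0,$$

but the right-hand side of (xiv)

$$= \frac{1}{N^3}\Bigl\{N\sum N_k\sigma_k^2 - \Bigl(\sum N_k\sigma_k\Bigr)^2\Bigr\} \ge 0$$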


(equality holding if all the $\sigma_k$ are equal). In fact both these limitations will decrease the theoretical advantage (if any) of stratified over random sampling, and we must take them into account in assessing the relative merits of the two methods.
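
The second departure described above (no $n_k$ may exceed its $N_k$) can be sketched as a short routine. This is a minimal illustration under stated assumptions, not a procedure given in the paper: the function name is hypothetical, the capping is repeated until no stratum exceeds its size, the $\sigma_k$ are taken to be positive, and the result is left fractional (the rounding to integers is not attempted here).

```python
from typing import List, Sequence


def capped_optimum_allocation(n: int, sizes: Sequence[int],
                              sds: Sequence[float]) -> List[float]:
    """Allocate n proportionally to N_k * sigma_k, setting n_k = N_k wherever
    the proportional share would exceed the stratum size (assumed reading of 3-2)."""
    if n > sum(sizes):
        raise ValueError("sample size exceeds population size")
    r = len(sizes)
    capped = [False] * r
    alloc = [0.0] * r
    while True:
        # Units still to be shared among the strata that are not fully sampled.
        remaining = n - sum(sizes[k] for k in range(r) if capped[k])
        weight = sum(sizes[k] * sds[k] for k in range(r) if not capped[k])
        newly_capped = False
        for k in range(r):
            if capped[k]:
                alloc[k] = float(sizes[k])
                continue
            alloc[k] = remaining * sizes[k] * sds[k] / weight
            if alloc[k] > sizes[k]:
                capped[k] = True
                newly_capped = True
        if not newly_capped:
            return alloc


# Hypothetical population in which one small stratum is very variable.
print(capped_optimum_allocation(20, [2, 50, 48], [50.0, 1.0, 1.0]))
```

In this hypothetical case the first stratum would receive about 10 units under (iii) but contains only 2, so it is sampled completely and the remaining 18 units are shared in the ratio 50 : 48.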


4. EXAMPLES 


In the four examples illustrated by Figs. 1-4, $\mathrm{var}(m_o)$ and $\mathrm{var}(\bar{x})$ have been calculated for different stratified populations, and $\psi = \log_{10}\{\mathrm{var}(\bar{x})/\mathrm{var}(m_o)\}$ plotted against $c = n/N$, so that $\psi < 0$ if $\mathrm{var}(\bar{x}) < \mathrm{var}(m_o)$. In each figure the different curves represent populations with the same $\sigma_k$, with the $N_k$ in the same proportions but with different magnitudes, and with the $\mu_k$ equal, so that $S = 0$.

Example 1. $\sigma_k = 2, 3, 4$; $N_k \propto 6, 4, 3$ ($N = 65, 26, 13$).
Example 2. $\sigma_k = 4, 5, 6$; $N_k \propto 6, 5, 4$ ($N = 120, 60, 30, 15$).
Example 3. $\sigma_k = 4, 5, 6$; $N_k \propto 3, 11, 4$ ($N = 126, 54, 36, 18$).
Example 4. $\sigma_k = 1, 1, 2, 3, 4$; $N_k \propto 5, 5, 1, 2, 3$ ($N = 128, 32$).


The first thing to be noticed about the graphs is that in each one $\psi$ increases, generally speaking, as $n$ increases. Further, in any one example the range of $c$ for which $\psi < 0$ increases as $N$ decreases; and in this sense we can say that for small samples of proportionate size from a stratified population, the advantage (if any) of the stratified method decreases as $N$ decreases.

[Figs. 1-3: $\psi$ plotted against $c$ for Examples 1-3, with one curve for each value of $N$.]

Secondly, the curves are not smooth. The reason for this is clear. In the optimum stratified method the $n_k$ are to be chosen approximately proportional to $N_k\sigma_k$ (a second approximation is $(N_k + \tfrac{1}{2})\sigma_k$). In Example 1, the $N_k\sigma_k$ are all equal, and it follows that the $n_k$ should be nearly equal. If $n \equiv 0 \pmod 3$ this can be done, but for $n \equiv 1, 2 \pmod 3$, $\mathrm{var}(m_o)$ takes values greater than it would if fractional $n_k$ were allowed. This produces a rise in the curve of $\psi$ for $n \equiv 0 \pmod 3$, which gradually disappears as $n$ increases since the effect is much greater for small $n$. The same 'period' is noticeable in Fig. 2, but in Figs. 3 and 4, where the main 'periods' are respectively 15 and 30, the effect is smaller.
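
The 'period' of 3 in Example 1 can be checked by direct calculation. The sketch below rests on assumptions not spelled out in this section: that $\sigma_k^2$ is the stratum variance with divisor $N_k$, that the variance of a stratum mean carries the factor $(N_k - n_k)/(N_k - 1)$, and that the best integer $n_k$ are found by exhaustive search. The function names are hypothetical. Under these assumptions, for $N = 65$ the value of $\psi$ is negative at $n = 5$, rises at $n = 6$ and falls back at $n = 7$.

```python
from itertools import product
from math import log10


def var_unrestricted(sizes, sds, n, S=0.0):
    """Variance of the mean of an unrestricted random sample of n (assumed form)."""
    N = sum(sizes)
    pooled = sum(Nk * sd ** 2 for Nk, sd in zip(sizes, sds)) / N
    return (N - n) / ((N - 1) * n) * (pooled + S)


def var_stratified(sizes, sds, alloc):
    """Variance of the stratified estimate for integer sample sizes alloc (assumed form).
    Requires every N_k >= 2."""
    N = sum(sizes)
    return sum((Nk / N) ** 2 * (Nk - nk) / (Nk - 1) * sd ** 2 / nk
               for Nk, sd, nk in zip(sizes, sds, alloc))


def best_var_stratified(sizes, sds, n):
    """Smallest stratified variance over all integer n_k >= 1 with sum n (n >= r)."""
    ranges = [range(1, min(Nk, n) + 1) for Nk in sizes]
    return min(var_stratified(sizes, sds, a)
               for a in product(*ranges) if sum(a) == n)


def psi(sizes, sds, n):
    return log10(var_unrestricted(sizes, sds, n) / best_var_stratified(sizes, sds, n))


sizes, sds = [30, 20, 15], [2.0, 3.0, 4.0]  # Example 1 with N = 65
for n in range(3, 10):
    print(n, round(psi(sizes, sds, n), 4))
```

The exhaustive search is practical only for small $n$ and few strata, which is enough to show the effect.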

We saw in § 3-1 that, broadly speaking, the advantage of the stratified method decreases as the $\sigma_k$ tend to equality. This is illustrated by comparing Examples 1 and 2. In each of these the $N_k\sigma_k$ are equal, but in Example 2 the $\sigma_k$ are proportionally more nearly equal. Comparing curves for about the same $N$ ($N = 65, 26, 13$ in Fig. 1 with $N = 60, 30, 15$ in Fig. 2), we see that in Fig. 2 the range of values of $c$ for which $\psi < 0$ is greater than in Fig. 1.

Fig. 3 has the same $\sigma_k$ as Fig. 2, but the $N_k\sigma_k$, and therefore the $n_k$, are different. The curves are similar to those of Fig. 2, but the stratified method is still less advantageous (especially for small values of $c$).

Example 4 has five instead of three strata, and there is quite large variation between the $\sigma_k$ and between the $N_k\sigma_k$. There is no doubt here that $\psi > 0$, the only exception being for $N = 32$, $n = 5$, where $\psi = -0.02$.

These examples may be said to give the maximum advantage to the stratified method, in the sense that the calculated values of $\mathrm{var}(m_o)$ depend on the best method of choosing the $n_k$. If the $\sigma_k$ are not sufficiently well known to enable the best values of $n_k$ to be used, then we shall get a larger value of $\mathrm{var}(m_o)$. It must be remembered, however, that in all these examples we assumed that there was no variation between the $\mu_k$, a situation which would be very unlikely to occur in practice. Now it is clear from (xii) that if the same $N_k$ and $\sigma_k$ are considered as in one of the above examples, but the $\mu_k$ are now unequal, the effect is to increase the value of $\mathrm{var}(\bar{x})$ by $(N-n)S/(N-1)n$, where $S = \sum N_k(\mu_k - \mu)^2/N$; so, in any example where $\psi < 0$ for some particular values of $N$ and $n$, we can reverse the direction of the inequality by choosing a sufficiently large value of $S$, say

$$S_0 = [\mathrm{var}(m_o) - \mathrm{var}(\bar{x})]\,(N-1)n/(N-n).$$

In comparing different values of $S_0$ for different examples, it must be remembered that the order of magnitude of $S_0$ depends on the $\sigma_k$, and a suitable measure of comparison will be $S_0/\sigma_0^2$, where $\sigma_0^2$ is the pooled variance within strata $= \sum N_k\sigma_k^2/N$.

In Example 1, the largest value of $S_0$ is for $N = 13$, $n = 4$. Here $\mathrm{var}(m_o) = 1.9172$, $\mathrm{var}(\bar{x}) = 1.5577$, and $S_0 = 1.917 = 0.231\sigma_0^2$. (If $\mu_1 = 0$, $\mu_2 = 2$, $\mu_3 = 3.5$, then $S = 2.066$.)

In Example 2, the largest value of $S_0$ is for $N = 15$, $n = 4$. Here $\mathrm{var}(m_o) = 24.647$, $\mathrm{var}(\bar{x}) = 19.119$, and $S_0 = 28.14 = 0.289\sigma_0^2$. (If $\mu_1 = 0$, $\mu_2 = 7$, $\mu_3 = 13$, then $S = 28.51$.)

In Example 3, the largest value of $S_0$ is for $N = 18$, $n = 3$. Here $\mathrm{var}(m_o) = 46.235$, $\mathrm{var}(\bar{x}) = 30.523$, and $S_0 = 53.42 = 0.515\sigma_0^2$. (If $\mu_1 = 0$, $\mu_2 = 8$, $\mu_3 = 17$, then $S = 57.5$.)

In Example 4, the largest value of $S_0$ is for $N = 32$, $n = 5$ (the only occasion in this example where $\psi < 0$). Here $\mathrm{var}(m_o) = 0.91406$, $\mathrm{var}(\bar{x}) = 0.87097$, and $S_0 = 0.2474 = 0.049\sigma_0^2$. (If $\mu_1 = \mu_2 = 0$ and $\mu_3 = \mu_4 = \mu_5 = 1$, then $S = 0.285$.)
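
The figures quoted for Examples 1 and 4 can be recomputed in a few lines. The sketch assumes, as an interpretation rather than something stated in this section, that $\sigma_k^2$ has divisor $N_k$, that a stratum mean has variance $\frac{N_k - n_k}{N_k - 1}\,\sigma_k^2/n_k$, and that the $n_k$ are the best integers found by exhaustive search. All function names are hypothetical.

```python
from itertools import product


def var_unrestricted(sizes, sds, n, S=0.0):
    """(N - n) / ((N - 1) n) * (pooled within-stratum variance + S), assumed form."""
    N = sum(sizes)
    pooled = sum(Nk * sd ** 2 for Nk, sd in zip(sizes, sds)) / N
    return (N - n) / ((N - 1) * n) * (pooled + S)


def var_stratified(sizes, sds, alloc):
    """Variance of sum(N_k * xbar_k) / N for integer n_k (assumed form, N_k >= 2)."""
    N = sum(sizes)
    return sum((Nk / N) ** 2 * (Nk - nk) / (Nk - 1) * sd ** 2 / nk
               for Nk, sd, nk in zip(sizes, sds, alloc))


def best_allocation(sizes, sds, n):
    """Integer n_k >= 1 with sum n minimizing the stratified variance."""
    ranges = [range(1, min(Nk, n) + 1) for Nk in sizes]
    candidates = (a for a in product(*ranges) if sum(a) == n)
    return min(candidates, key=lambda a: var_stratified(sizes, sds, a))


def s_zero(sizes, sds, n):
    """S_0 = [var(m_o) - var(xbar)] (N - 1) n / (N - n), with S = 0 in var(xbar)."""
    N = sum(sizes)
    v_strat = var_stratified(sizes, sds, best_allocation(sizes, sds, n))
    return (v_strat - var_unrestricted(sizes, sds, n)) * (N - 1) * n / (N - n)


ex1 = ([6, 4, 3], [2.0, 3.0, 4.0], 4)                    # Example 1, N = 13, n = 4
ex4 = ([10, 10, 2, 4, 6], [1.0, 1.0, 2.0, 3.0, 4.0], 5)  # Example 4, N = 32, n = 5
for sizes, sds, n in (ex1, ex4):
    alloc = best_allocation(sizes, sds, n)
    print(alloc, var_stratified(sizes, sds, alloc),
          var_unrestricted(sizes, sds, n), s_zero(sizes, sds, n))
```

Under these assumptions the search selects $n_k = 1, 1, 2$ in Example 1 and one unit per stratum in Example 4, and the three quantities agree with the values quoted above to the precision given.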




5. CONCLUSIONS
We have seen in § 3 that optimum stratified sampling may give a less accurate estimate of $\mu$ than unrestricted random sampling when the $\mu_k$ are nearly equal, and when $N$ is small or the $\sigma_k$ are nearly equal. The examples of § 4 bear out these conclusions and show that the effect is greatest for small $n$, Fig. 3 providing an additional suggestion that if the products $N_k\sigma_k$ are widely different the advantage of the stratified method tends to be nullified. In practice, we should probably only apply stratified sampling if we knew that the strata were sufficiently distinct to ensure considerable variation between either the $\mu_k$ or the $\sigma_k$. In the first case, if nothing much was known about the $\sigma_k$ and a preliminary sample on the lines suggested by Sukhatme was impracticable, we should use proportionate sampling, and the size of $S$ would usually ensure that $\mathrm{var}(m_p) < \mathrm{var}(\bar{x})$. In the second case, we should use optimum stratified sampling, and rely on the variability of the $\sigma_k$ to ensure that $\mathrm{var}(m_o) < \mathrm{var}(\bar{x})$. Since an adequate degree of knowledge about the $\sigma_k$ would be unlikely unless the $N_k$ were quite large, we should in this case almost certainly be safe in using the method. To the above considerations must be added the fact that if very inaccurate estimates of the $\sigma_k$ are used in (iii), then, whatever the nature of the population, the resulting procedure may be extremely inefficient.

It must be realized, of course, that even if it were known that $\mathrm{var}(m_o) < \mathrm{var}(\bar{x})$, it would not follow that the optimum stratified method would necessarily be the most convenient. It may be impossible, or at any rate inconvenient, to do any sort of random sampling, and some sort of quasi-random sampling may have to be used (see e.g. Madow & Madow, 1944), but if the principle of random sampling is applicable the stratified method is not likely to be much more inconvenient, and in fact in most cases will be more convenient, than the unrestricted method.

SUMMARY
The stratified method has been used in the past almost solely for large-scale social and agricultural surveys. Here the stratum sizes are large, and known results for infinite populations apply. There seems no reason why stratified sampling should not be used to advantage for smaller populations, and it is important to know to what extent these results still apply. In this paper a comparison has been made with unrestricted random sampling in the usual case where we are interested in estimating the mean. The advantages of the stratified method are modified, but in most cases where the method is applicable it will be found to be worth while.


The above work was carried out as part of the research programme of the National Physical Laboratory, and this paper is published by permission of the Director of the Laboratory. The author desires to acknowledge the assistance rendered by Mr D. V. Lindley who prepared the diagrams.


REFERENCES 

BOWLEY, A. L. (1926). Measurement of the precision attained in sampling. Bull. Int. Inst. Statist. 22, 1ère livraison.

KENDALL, M. G. (1943). The Advanced Theory of Statistics, 1. Griffin and Co.

KENDALL, M. G. (1946). The Advanced Theory of Statistics, 2. Griffin and Co.

MADOW, W. G. & MADOW, L. H. (1944). On the theory of systematic sampling. Ann. Math. Statist. 15, 1-24.

NEYMAN, J. (1934). On the two different aspects of the representative method. J. Roy. Statist. Soc. 97, 558-625.

SUKHATME, P. V. (1935). Contribution to the theory of the representative method. Suppl. J. Roy. Statist. Soc. 2, 253-68.

WILKS, S. S. (1943). Mathematical Statistics. Princeton University Press.







SOME THEOREMS ON TIME SERIES. I 


By P. A. P. MORAN 
Institute of Statistics, Oxford University 


One of the principal problems in the theory of time series is to discuss the relation between two series, and in the present paper we prove a theorem by which we can test whether two such series are independent. Such a test of significance must depend on the models which we assume for the probability processes which generate the series. In practice, the two most useful models are, first, that of a moving average of a series of independent random components and, secondly, the solutions of linear stochastic difference equations.

Let

$$\ldots,\ \eta(t-1),\ \eta(t),\ \eta(t+1),\ \ldots$$

be a sequence of independent random variables each distributed in the same distribution which we take to have zero mean and its second, third, and fourth moments finite. Then the time series generated by

$$X(t) = \sum_{i=0}^{N} \alpha_i\,\eta(t-i)$$

is a moving average with weights $\alpha_i$. On the other hand, consider a stochastic difference equation of the form

$$X(t) + a_1 X(t-1) + \ldots + a_h X(t-h) = \eta(t). \qquad (1)$$

In order that the solution of (1) for successive values of $t$ shall form a stationary series it is necessary to impose the condition that the roots of the characteristic equation

$$z^h + a_1 z^{h-1} + \ldots + a_h = 0 \qquad (2)$$

shall all lie inside the circle $|z| = 1$ (Wold, 1938, p. 53). When this is true the solution of (1) can be shown to be of the form

$$X(t) = \sum_{i=0}^{\infty} \alpha_i\,\eta(t-i),$$

where the $\alpha_i$ are certain functions of the roots of (2). In this case $\sum_{i=0}^{\infty} |\alpha_i|$ is majorized by a convergent geometric series.


Thus we see that both the above models are included in the more general one in which we define $X(t)$ as given by

$$X(t) = \sum_{i=0}^{\infty} \alpha_i\,\eta(t-i),$$

where the $\alpha_i$ are any sequence of constants satisfying $\sum_{i=0}^{\infty} |\alpha_i| < \infty$. Now suppose

$$\ldots,\ \zeta(t-1),\ \zeta(t),\ \zeta(t+1),\ \ldots$$

is another sequence of independent random variables having a distribution with zero mean and finite second, third and fourth moments. We write

$$Y(t) = \sum_{i=0}^{\infty} \beta_i\,\zeta(t-i),$$

where $\sum_{i=0}^{\infty} |\beta_i| < \infty$. To discuss whether two such empirical series of this form are correlated we prove that the covariance

$$S_n = \sum_{t=1}^{n} X(t)\,Y(t) \qquad (3)$$



tends, as $n$ increases, to be distributed in the normal form about zero mean with a second moment which is a function of the $\alpha_i$ and the $\beta_i$. We shall discuss later the calculation of this second moment from empirical series, in which case some care is necessary.

We first illustrate our method of proof by considering the much simpler problem of determining the asymptotic distribution of the sum

$$T_n = \sum_{s=1}^{n} X(t-s). \qquad (4)$$

We shall show that this asymptotic distribution is also, under certain conditions, normal. This result is interesting because it establishes a central limit theorem (and therefore a law of large numbers) for stationary stochastic processes of this type. The law of large numbers for Markov chains has been considered by several writers, in particular Bernstein (1927), who proves his results by using central limit theorems for non-independent components. His theorems cannot be applied in the present case, but some of the ideas of his methods can. Consider (4) above, where $X(t)$ is defined by


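$$X(t) = \sum_{i=0}^{\infty} \alpha_i\,\eta(t-i)$$

and $\sum_{i=0}^{\infty} |\alpha_i|$ is convergent. There is no loss in generality in supposing that

$$\sum_{i=0}^{\infty} |\alpha_i| < 1.$$

Clearly

$$E(T_n) = \sum_{s=1}^{n}\sum_{i=0}^{\infty} \alpha_i\,E[\eta(t-s-i)] = 0.$$

Write $\sigma^2 = E(\eta^2)$, $c_0 = E[X(t)^2]$, $c_s = E[X(t)X(t-s)]$. Then

$$c_0 = \sigma^2(\alpha_0^2 + \alpha_1^2 + \ldots), \qquad c_s = \sigma^2(\alpha_0\alpha_s + \alpha_1\alpha_{s+1} + \ldots),$$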


which are both clearly convergent. Moreover,

$$R_n = E(T_n^2) = nE[X(t)^2] + 2\sum_{i=1}^{n-1}\sum_{s=1}^{n-i} E[X(t-i)X(t-i-s)] = nc_0 + 2\sum_{i=1}^{n-1}(n-i)c_i.$$

$n^{-1}R_n$ tends, as $n$ increases, to

$$R = c_0 + 2\sum_{i=1}^{\infty} c_i$$

if this series converges absolutely. We shall show that $\lim n^{-1}R_n$ is finite. For $R$ is clearly

$$\sigma^2\Bigl(\sum_{i=0}^{\infty}\alpha_i\Bigr)^2,$$

and this is finite. Moreover, we notice that $n^{-1}R_n$ is not greater than

$$\sigma^2\Bigl(\sum_{i=0}^{\infty}|\alpha_i|\Bigr)^2.$$

We must now impose the condition that $\sum_{i=0}^{\infty}\alpha_i$ is not zero. This condition is necessary to our method of argument. If it is not zero, it may be assumed, without loss of generality, greater than a positive number. We now show that as $n$ increases


$$\mathrm{pr}\{t_0(2R_n)^{1/2} < T_n < t_1(2R_n)^{1/2}\}$$

tends, uniformly in $t_0$ and $t_1$, to

$$\pi^{-1/2}\int_{t_0}^{t_1} e^{-t^2}\,dt.$$


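We require the following lemma (Bernstein, 1927, p. 12):

Lemma I. Let

$$\rho_n = z_n + \omega_n,$$

where $\rho_n$, $z_n$ and $\omega_n$ are random variables such that

$$E(z_n) = E(\omega_n) = 0, \qquad E(z_n^2) = H_n, \qquad E(\omega_n^2) = H_n'.$$

Then if, for $n$ large,

$$\mathrm{pr}\{t_0(2H_n)^{1/2} < z_n < t_1(2H_n)^{1/2}\}$$

tends, uniformly in $t_0$ and $t_1$, to

$$\pi^{-1/2}\int_{t_0}^{t_1} e^{-t^2}\,dt,$$

then

$$\mathrm{pr}\{t_0(2J_n)^{1/2} < \rho_n < t_1(2J_n)^{1/2}\},$$

where $J_n = E(\rho_n^2)$, tends, uniformly in $t_0$ and $t_1$, to

$$\pi^{-1/2}\int_{t_0}^{t_1} e^{-t^2}\,dt,$$

provided that $\lim H_n'/H_n = 0$.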


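Let $\epsilon$ be an arbitrarily small number and choose $N$ so large that

$$\sum_{i=N}^{\infty} |\alpha_i| < \epsilon \sum_{i=0}^{\infty} |\alpha_i| < \epsilon.$$

Write

$$X_1(t) = \sum_{i=0}^{N} \alpha_i\,\eta(t-i), \qquad T_n' = \sum_{s=1}^{n} X_1(t-s).$$

Then $E(T_n') = 0$, and write $R_n' = E(T_n'^2)$.

We shall prove that the distribution of $T_n'$ tends to normality, i.e. that

$$\mathrm{pr}\{t_0(2R_n')^{1/2} < T_n' < t_1(2R_n')^{1/2}\}$$

tends, uniformly in $t_0$ and $t_1$, to

$$\pi^{-1/2}\int_{t_0}^{t_1} e^{-t^2}\,dt.$$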
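We first calculate $R_n$ and $R_n'$ in another way. For

$$T_n = \sum_{s=1}^{n} X(t-s) = \sum_{s=1}^{n}\sum_{i=0}^{\infty} \alpha_i\,\eta(t-s-i)$$

$$= \alpha_0\eta(t-1) + (\alpha_0+\alpha_1)\eta(t-2) + \ldots + (\alpha_0+\ldots+\alpha_{n-1})\eta(t-n) + \sum_{s=1}^{\infty}(\alpha_s+\ldots+\alpha_{s+n-1})\,\eta(t-s-n),$$

and so

$$R_n = E(T_n^2) = \sigma^2\Bigl\{\alpha_0^2 + (\alpha_0+\alpha_1)^2 + \ldots + (\alpha_0+\ldots+\alpha_{n-1})^2 + \sum_{s=1}^{\infty}(\alpha_s+\ldots+\alpha_{s+n-1})^2\Bigr\}$$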











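and this series converges. On the other hand,

$$T_n' = \alpha_0\eta(t-1) + (\alpha_0+\alpha_1)\eta(t-2) + \ldots + (\alpha_0+\ldots+\alpha_{N-1})\eta(t-N)$$
$$\quad + \sum_{p=1}^{n-N}(\alpha_0+\ldots+\alpha_N)\,\eta(t-N-p)$$
$$\quad + (\alpha_1+\ldots+\alpha_N)\eta(t-n-1) + \ldots + \alpha_N\eta(t-n-N),$$

and so

$$R_n' = \sigma^2\{\alpha_0^2 + (\alpha_0+\alpha_1)^2 + \ldots + (\alpha_0+\ldots+\alpha_{N-1})^2 + (n-N)(\alpha_0+\ldots+\alpha_N)^2 + (\alpha_1+\ldots+\alpha_N)^2 + \ldots + \alpha_N^2\}.$$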


Since we have already supposed that $\sum_{i=0}^{\infty}\alpha_i$ is positive, there exist positive numbers $N_0$ and $d$ such that for all $N > N_0$, $\sum_{i=0}^{N}\alpha_i > d$. If this is not true the theorem is in general false. For suppose the distribution of the $\eta$'s to be non-normal and write $\alpha_0 = 1$, $\alpha_1 = -1$, $\alpha_i = 0$ ($i > 1$). Then the distribution of $T_n$ does not tend to normality and its variance does not increase with $n$. We shall later show that this condition on the $\alpha$'s is in fact satisfied for the solutions of stochastic difference equations.


Now by the ordinary central limit theorem, as $n$ increases,

$$T_n'' = \sum_{p=1}^{n-N}(\alpha_0+\ldots+\alpha_N)\,\eta(t-N-p)$$

tends to be distributed normally with zero mean and variance

$$R_n'' = (n-N)(\alpha_0+\ldots+\alpha_N)^2\sigma^2,$$

that is

$$\mathrm{pr}\{t_0(2R_n'')^{1/2} < T_n'' < t_1(2R_n'')^{1/2}\}$$

tends, uniformly in $t_0$ and $t_1$, to

$$\pi^{-1/2}\int_{t_0}^{t_1} e^{-t^2}\,dt.$$

Using Lemma I we see that the same is true, for fixed $N$, when we replace $T_n''$ by $T_n'$ and $R_n''$ by $R_n'$. Now

$$T_n = T_n' + Q,$$

say, where $Q$ is what we get if we replace the sequence $(\alpha_0, \alpha_1, \ldots)$ in $T_n$ by $(\alpha_{N+1}, \ldots)$ and alter $t$, and from (5) we can choose $N$ so large that for $n > N$, $n^{-1}E(Q^2) < \epsilon$, say. Taking a sequence $\epsilon_1, \epsilon_2, \ldots$ tending to zero and choosing first $N$ sufficiently large and then $n$ and using Lemma I again, we see that

$$\mathrm{pr}\{t_0(2R_n)^{1/2} < T_n < t_1(2R_n)^{1/2}\}$$

tends, uniformly in $t_0$ and $t_1$, to

$$\pi^{-1/2}\int_{t_0}^{t_1} e^{-t^2}\,dt.$$


To complete the discussion we must show that the condition we have imposed on the sequence $\alpha_0, \alpha_1, \ldots$ is satisfied by the coefficients of the solutions of stationary stochastic difference equations. Consider an equation

$$X(t) + a_1X(t-1) + \ldots + a_hX(t-h) = \eta(t)$$

such that the roots of

$$z^h + a_1z^{h-1} + \ldots + a_h = 0 \qquad (6)$$

all lie inside the circle $|z| = 1$. Then the solution of this equation is given (Wold, 1938, p. 53) by

$$X(t) = \sum_{i=0}^{\infty} \alpha_i\,\eta(t-i),$$








where the $\alpha_i$ are now the solutions of the infinite set of equations

$$\alpha_0 = 1,$$
$$\alpha_1 + a_1\alpha_0 = 0,$$
$$\alpha_2 + a_1\alpha_1 + a_2\alpha_0 = 0,$$
$$\ldots$$
$$\alpha_h + a_1\alpha_{h-1} + \ldots + a_h\alpha_0 = 0,$$
$$\alpha_{h+s} + a_1\alpha_{h+s-1} + \ldots + a_h\alpha_s = 0,$$
$$\ldots$$

and since the left-hand side is an absolutely convergent double series, we add, obtaining

$$(1 + a_1 + \ldots + a_h)\sum_{i=0}^{\infty}\alpha_i = 1,$$

and so $\sum_{i=0}^{\infty}\alpha_i \ne 0$ and, as already observed, without loss of generality, may be supposed positive. This quantity is finite because all the roots of equation (6) lie inside the circle $|z| = 1$. Moreover, it follows that

$$R = \Bigl(\sum_{i=0}^{\infty}\alpha_i\Bigr)^2\sigma^2 = (1 + a_1 + \ldots + a_h)^{-2}\sigma^2.$$

This is, in fact, proportional to the derivative at zero of the integrated power spectrum (Wold, 1938, p. 69).
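
The recursion for the $\alpha_i$ and the identity $(1 + a_1 + \ldots + a_h)\sum\alpha_i = 1$ can be checked numerically. The following is a minimal sketch: the function name and the particular coefficients are hypothetical, and the infinite sum is truncated at a finite number of terms, which is harmless here because the $\alpha_i$ decay geometrically.

```python
from typing import List, Sequence


def moving_average_weights(a: Sequence[float], terms: int) -> List[float]:
    """First `terms` weights of X(t) + a_1 X(t-1) + ... + a_h X(t-h) = eta(t),
    from alpha_0 = 1 and alpha_k = -(a_1 alpha_{k-1} + ... + a_h alpha_{k-h})."""
    alpha = [1.0]
    for k in range(1, terms):
        alpha.append(-sum(a[j - 1] * alpha[k - j]
                          for j in range(1, min(k, len(a)) + 1)))
    return alpha


# Hypothetical equation with z^2 - 0.5 z + 0.06 = (z - 0.2)(z - 0.3),
# so both roots lie inside the unit circle.
a = [-0.5, 0.06]
alpha = moving_average_weights(a, 200)
total = sum(alpha)
sigma2 = 1.0  # assumed variance of eta

print((1 + sum(a)) * total)  # should be very close to 1
print(sigma2 * total ** 2)   # R, which should equal sigma2 / (1 + a_1 + a_2)^2
```

The same routine gives the weights needed for $c_s = \sigma^2\sum_i \alpha_i\alpha_{i+s}$ when a difference equation has been chosen for a series.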

We now turn to the problem of discussing the relation between two such series and we consider the asymptotic distribution of

$$S_n = \sum_{t=1}^{n} X(-t)\,Y(-t),$$

where

$$X(t) = \sum_{i=1}^{\infty} \alpha_i\,\eta(t-i) \quad (\alpha_1 \ne 0), \qquad (7)$$

and

$$Y(t) = \sum_{i=1}^{\infty} \beta_i\,\zeta(t-i) \quad (\beta_1 \ne 0). \qquad (8)$$

We write $S_n$ in this form rather than that of (3) for the sake of convenience in what follows, and we have altered the notation of the sums (7) and (8) so that they begin with the coefficients $\alpha_1$ and $\beta_1$ for the same reason. Writing

$$c_s = E[X(t)X(t-s)], \qquad d_s = E[Y(t)Y(t-s)] \quad (s = 0, 1, \ldots),$$

as before, we have

$$c_s = \sigma_1^2(\alpha_1\alpha_{s+1} + \alpha_2\alpha_{s+2} + \ldots), \qquad d_s = \sigma_2^2(\beta_1\beta_{s+1} + \beta_2\beta_{s+2} + \ldots),$$

where $\sigma_1^2$ and $\sigma_2^2$ are the second moments of $\eta$ and $\zeta$. Then


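$$E(S_n) = \sum_{t=1}^{n} E\bigl(X(-t)\,Y(-t)\bigr) = 0,$$

$$E(S_n^2) = E\Bigl\{\sum_{t=1}^{n} X(-t)\,Y(-t)\Bigr\}^2$$

$$= E\Bigl[\sum_{t=1}^{n} X^2(-t)\,Y^2(-t) + 2\sum_{t=1}^{n-1}\sum_{s=1}^{n-t} X(-t)X(-t-s)\,Y(-t)Y(-t-s)\Bigr]$$

$$= nc_0d_0 + 2\sum_{s=1}^{n-1}(n-s)\,c_sd_s. \qquad (9)$$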
















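Consider the behaviour of $n^{-1}E(S_n^2)$ as $n$ increases. Clearly

$$n^{-1}E(S_n^2) \to c_0d_0 + 2\sum_{s=1}^{\infty} c_sd_s = C, \text{ say}, \qquad (10)$$

if the series $C$ is absolutely convergent. If $X$ and $Y$ are moving averages or the solutions of stationary stochastic difference equations this is certainly true, for in the first case the series is finite, and in the second it is majorized by a convergent geometric series. We show that it is true in the general case by the following argument. Without restricting generality, we may assume, as before, that $\sum_{1}^{\infty}|\alpha_i| < 1$, $\sum_{1}^{\infty}|\beta_i| < 1$. Then

$$|c_s| \le \sigma_1^2(|\alpha_1\alpha_{s+1}| + \ldots) \le \sigma_1^2(|\alpha_{s+1}| + \ldots),$$

and so

$$\sum_{1}^{\infty} c_sd_s \le \sum_{1}^{\infty} |c_sd_s| \le \sigma_1^2\sum_{1}^{\infty}|d_s| \le \sigma_1^2\sigma_2^2\sum_{s=1}^{\infty}(|\beta_1\beta_{s+1}| + |\beta_2\beta_{s+2}| + \ldots) \le \sigma_1^2\sigma_2^2\Bigl(\sum_{1}^{\infty}|\beta_i|\Bigr)^2. \qquad (11)$$

Also

$$c_0d_0 \le \sigma_1^2\sigma_2^2\Bigl(\sum_{1}^{\infty}|\alpha_i|^2\Bigr)\Bigl(\sum_{1}^{\infty}|\beta_i|^2\Bigr),$$

and so $c_0d_0 + 2\sum_{1}^{\infty}|c_sd_s|$ is finite. We now prove that $C$ is not zero. For

$$C = c_0d_0 + 2\sum_{s=1}^{\infty} c_sd_s = \sigma_1^2\sigma_2^2\Bigl\{(\alpha_1^2+\alpha_2^2+\ldots)(\beta_1^2+\beta_2^2+\ldots) + 2\sum_{s=1}^{\infty}\sum_{m=1}^{\infty}\sum_{n=1}^{\infty}\alpha_m\alpha_{m+s}\beta_n\beta_{n+s}\Bigr\}$$

and after some rearrangement, this equals

$$\sigma_1^2\sigma_2^2[(\alpha_1\beta_1)^2 + (\alpha_1\beta_2+\alpha_2\beta_1)^2 + (\alpha_1\beta_3+\alpha_2\beta_2+\alpha_3\beta_1)^2 + \ldots],$$

and $(\alpha_1\beta_1)^2$ is greater than zero and the rest non-negative at least. We therefore conclude that

$$n^{-1}E(S_n^2) \to C,$$

where $0 < C < \infty$.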
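Assuming as before that $\sum_{1}^{\infty}|\alpha_i| < 1$, $\sum_{1}^{\infty}|\beta_i| < 1$, we define $N$ so that

$$\sum_{N+1}^{\infty}|\alpha_i| < \epsilon\sum_{1}^{\infty}|\alpha_i| < \epsilon, \qquad (12)$$

$$\sum_{N+1}^{\infty}|\beta_i| < \epsilon\sum_{1}^{\infty}|\beta_i| < \epsilon, \quad \text{where } \epsilon \text{ is small.} \qquad (13)$$

We now write

$$X_1(t) = \sum_{i=1}^{N}\alpha_i\,\eta(t-i), \qquad (14)$$

$$Y_1(t) = \sum_{i=1}^{N}\beta_i\,\zeta(t-i), \qquad (15)$$

and consider the sum

$$S_n' = \sum_{t=1}^{n} X_1(-t)\,Y_1(-t). \qquad (16)$$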





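We begin by proving that when $n$ is large this sum tends to be distributed in the normal form with a variance which is asymptotically equal to $nC_1$, where $C_1$ is obtained from $C$ by putting $\alpha_i = \beta_i = 0$ for $i > N$. For it is then clearly true that

$$n^{-1}E(S_n'^2) \to C_1.$$

Now consider

$$S_n' = \sum_{t=1}^{n} X_1(-t)\,Y_1(-t),$$

where $n$ is greater than $N$. For convenience of notation, we write

$$\eta_i = \eta(-i), \qquad \zeta_i = \zeta(-i).$$

We then have

$$S_n' = \sum_{i=1}^{n+N}\sum_{j=1}^{n+N} A_{ij}\,\eta_i\zeta_j,$$

where the $A_{ij}$ are certain constants. Moreover

$$E(\eta_i\zeta_j) = 0 \quad \text{all } i, j;$$
$$E(\eta_i^2\zeta_j^2) = \sigma_1^2\sigma_2^2 \quad \text{all } i, j;$$
$$E(\eta_i^2\zeta_j\zeta_k) = E(\eta_j\eta_k\zeta_i^2) = 0 \quad \text{for } j \ne k,$$
$$E(\eta_i\eta_j\zeta_k\zeta_l) = 0 \quad \text{if } i \ne j \text{ or } k \ne l.$$

It therefore follows that

$$E(S_n'^2) = \sigma_1^2\sigma_2^2\sum_{i=1}^{n+N}\sum_{j=1}^{n+N} A_{ij}^2.$$

Inserting (14) and (15) in (16) we have $A_{ij} = 0$ if $i > n+N$, or $j > n+N$ or $|i-j| > N-1$, and

$$S_n' = \Sigma_1 + \Sigma_2 + \Sigma_3 + \Sigma_4,$$

where

$$\Sigma_1 = \sum_{i=1}^{N}\sum_{j=1}^{N} A_{ij}\,\eta_i\zeta_j$$

with

$$A_{ij} = \alpha_{i-1}\beta_{j-1} + \alpha_{i-2}\beta_{j-2} + \ldots + \alpha_{1+i-j}\beta_1 \quad \text{for } i > j,$$
$$\quad = \alpha_{i-1}\beta_{j-1} + \ldots + \alpha_1\beta_{1+j-i} \quad \text{for } i < j,$$
$$\quad = \alpha_{i-1}\beta_{i-1} + \ldots + \alpha_1\beta_1 \quad \text{for } i = j.$$

We also have

$$\Sigma_2 = \sum\sum A_{ij}\,\eta_i\zeta_j,$$

where the sum is taken over values of $i$ and $j$ such that $|i-j| < N$, $i \le n$, $j \le n$ and either $N < i$ or $N < j$, where

$$A_{ij} = \alpha_1\beta_{p+1} + \ldots + \alpha_{N-p}\beta_N \quad \text{for } j - i = p > 0,$$
$$\quad = \alpha_{p+1}\beta_1 + \ldots + \alpha_N\beta_{N-p} \quad \text{for } i - j = p > 0,$$
$$\quad = \alpha_1\beta_1 + \ldots + \alpha_N\beta_N \quad \text{for } i = j.$$

Then

$$E(\Sigma_2) = 0, \qquad E(\Sigma_2^2) = \sigma_1^2\sigma_2^2\sum\sum A_{ij}^2,$$

where the sum is taken over the above values of $i$ and $j$. This equals

$$(n-N)\,\sigma_1^2\sigma_2^2[(\alpha_1\beta_N)^2 + (\alpha_1\beta_{N-1}+\alpha_2\beta_N)^2 + \ldots + (\alpha_1\beta_1+\ldots+\alpha_N\beta_N)^2 + \ldots + (\alpha_N\beta_2+\alpha_{N-1}\beta_1)^2 + (\alpha_N\beta_1)^2]. \qquad (17)$$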













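We know that $\alpha_1 \ne 0$. Let $\beta_l$ be the first term of the sequence $\beta_N, \beta_{N-1}, \ldots, \beta_1$ which is not zero. Such a term certainly exists. Then the sum in the outer brackets of (17) will contain a term of the form $(\alpha_1\beta_l)^2$ and consequently $E(\Sigma_2^2) > 0$, and for $N$ fixed will increase as $(n - N)$.

Next we have

$$\Sigma_3 = \sum\sum A_{ij}\,\eta_i\zeta_j,$$

where either

$$i \le n,\ j > n \text{ and } j - i < N, \qquad (18)$$

or

$$j \le n,\ i > n \text{ and } i - j < N, \qquad (19)$$

and

$$A_{ij} = \alpha_{p+1}\beta_1 + \ldots + \alpha_N\beta_{N-p} \quad \text{for } i - j = p > 0,$$
$$\quad = \alpha_1\beta_{p+1} + \ldots + \alpha_{N-p}\beta_N \quad \text{for } j - i = p > 0.$$

Then

$$E(\Sigma_3) = 0, \qquad E(\Sigma_3^2) = \sigma_1^2\sigma_2^2\sum\sum A_{ij}^2,$$

where the sum is taken over the values (18) and (19).

Finally

$$\Sigma_4 = \sum_{i=n+1}^{n+N}\sum_{j=n+1}^{n+N} A_{ij}\,\eta_i\zeta_j,$$

where

$$A_{ij} = \alpha_{i-n}\beta_{i-p-n} + \ldots + \alpha_N\beta_{N-p} \quad \text{for } i - j = p > 0,$$
$$\quad = \alpha_{j-p-n}\beta_{j-n} + \ldots + \alpha_{N-p}\beta_N \quad \text{for } j - i = p > 0,$$
$$\quad = \alpha_p\beta_p + \ldots + \alpha_N\beta_N \quad \text{for } i = j = n + p > n,$$

and

$$E(\Sigma_4) = 0, \qquad E(\Sigma_4^2) = \sigma_1^2\sigma_2^2\sum_{i=n+1}^{n+N}\sum_{j=n+1}^{n+N} A_{ij}^2.$$

We readily see that $E(\Sigma_i\Sigma_j) = 0$ for $i \ne j$ and therefore

$$E(S_n'^2) = E(\Sigma_1^2) + E(\Sigma_2^2) + E(\Sigma_3^2) + E(\Sigma_4^2).$$

Moreover, for constant $N$, $E(\Sigma_1^2)$, $E(\Sigma_3^2)$ and $E(\Sigma_4^2)$ are constant, and so for large $n$ we have

$$n^{-1}E(S_n'^2) \to C_1 = \sigma_1^2\sigma_2^2[(\alpha_1\beta_N)^2 + (\alpha_1\beta_{N-1}+\alpha_2\beta_N)^2 + \ldots + (\alpha_1\beta_1+\ldots+\alpha_N\beta_N)^2 + \ldots + (\alpha_N\beta_1)^2] \ne 0. \qquad (20)$$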

Now suppose that $N$ is fixed and consider the sum $\sum_{1}^{n} X_1(-t)\,Y_1(-t)$. We write

$$n = m(m+N) + p,$$

where $p < 2m + N + 1$ and $n$ is large enough for $m$ to be greater than $N$. This equation fixes $m$, which increases roughly as $n^{1/2}$ when $n$ increases. Write

$$S_n' = \Bigl(\sum_{t=1}^{N} + \sum_{N+1}^{N+m} + \ldots + \sum_{1+mN+m(m-1)}^{m^2+mN} + \sum_{n-p+1}^{n}\Bigr) X_1(-t)\,Y_1(-t)$$

$$= V_1 + U_1 + V_2 + U_2 + \ldots + V_m + U_m + W.$$

Then $V_1, \ldots, V_m$ and $W$ are all independent and $E(V_1^2), \ldots, E(V_m^2)$ are independent of $n$, and in fact not greater than $KN$, where $K$ is a constant independent of $N$. Also $E(W^2)$ is not greater than $K(2m+N+1)$, where $K$ may be taken as the same constant. $U_1, \ldots, U_m$ are also all independent and $E(U_i^2)$ is asymptotically equal to $mC_1$ when $n$ (and therefore $m$) are large. Therefore, writing

$$A_m = U_1 + \ldots + U_m, \qquad B_m = V_1 + \ldots + V_m + W,$$






we have

$$E(A_m) = 0, \qquad E(A_m^2) = \sum_{i=1}^{m} E(U_i^2),$$

$$E(B_m) = 0, \qquad E(B_m^2) = \sum_{i=1}^{m} E(V_i^2) + E(W^2),$$

and the latter increases as $m$, whilst the former increases as $m^2$ and so

$$E(B_m^2)\{E(A_m^2)\}^{-1} \to 0$$

as $n$ increases.

By Lemma I it is therefore sufficient to show that the distribution of $A_m$ tends to normality.
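Lemma II (Liapounoff's Central Limit Theorem, Bernstein, 1927). If

$$\Sigma_m = u_1^{(m)} + \ldots + u_m^{(m)}$$

is the sum of $m$ independent quantities such that

$$E(u_i^{(m)}) = 0, \qquad E\{(u_i^{(m)})^2\} = b_i^{(m)}, \qquad E\{(u_i^{(m)})^4\} = c_i^{(m)},$$

and if, as $m$ increases,

$$b_m^{-2}\sum_{i=1}^{m} c_i^{(m)} \to 0,$$

where $b_m = \sum_{i=1}^{m} b_i^{(m)} = E(\Sigma_m^2)$, then

$$\mathrm{pr}\{t_0(2b_m)^{1/2} < \Sigma_m < t_1(2b_m)^{1/2}\}$$

tends, uniformly in $t_0$ and $t_1$, to

$$\pi^{-1/2}\int_{t_0}^{t_1} e^{-t^2}\,dt.$$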


To apply the lemma we put U, = ui”. We already have E(U,) = 0. Also 
m-1E(U2)->C,>0 by (20), 


and so m~*b,, > C. 
Now consider c™ = E(U8), 
where U, = > : 


et Ang Min+r—-v+p-1 SAN 4e—D4¢—1? 


and the A,,, are calculated with m in place of n. Since the 9’s all have the same probability 
distribution and similarly for the ¢’s, we shall write 9, and €, for 94 4-~1)4p—1 804 Cuwiewaq-1 
for the sake of convenience. So we can write the above 


U,.= X Ayg Ip Co 
p=1@=1 
U‘ will be a polynomial of the fourth order in the 7’s and the ¢’s and its expectation may 
be regarded as the sum of two distinct types of terms so that H(U*) = 2E(w,)+ZE(w,), 
where the terms w, are of the form A$, 7§ 4, and the terms w, are of the form 43, A}, 93 Gi. G 


with (p,q) + (k,l). All other tezms arising in the product will clearly vanish when the ex- 
pectation is taken. 


. Then, since the A,, are bounded and the number of non-zero terms in w, and w, are not 
greater than 2N(m+ N) and 4N°(m + N)? respectively, we have 


E(U$) < Km’, 
where K is a constant depending on N but independent of m and n. It follows that - 


bats OM 
1 











is of order $m^{-1}$ and tends to zero as $n$ and $m$ increase. The conditions of the lemma are therefore satisfied and we conclude that

$$\mathrm{pr}\{t_0(2E(A_m^2))^{1/2} < A_m < t_1(2E(A_m^2))^{1/2}\}$$

tends, uniformly in $t_0$ and $t_1$, to

$$\pi^{-1/2}\int_{t_0}^{t_1} e^{-t^2}\,dt.$$

Applying Lemma I we have

$$\mathrm{pr}\{t_0[2E(S_n'^2)]^{1/2} < S_n' < t_1[2E(S_n'^2)]^{1/2}\}$$

tends, uniformly in $t_0$ and $t_1$, to the same limit.
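We now consider the relationship between $S_n'$ and $S_n$. Write

$$S_n'' = S_n - S_n' = \sum_{t=1}^{n} X(-t)\,Y(-t) - \sum_{t=1}^{n} X_1(-t)\,Y_1(-t)$$

$$= \sum_{t=1}^{n}\Bigl(\sum_{i=1}^{\infty}\alpha_i\eta_{t+i}\Bigr)\Bigl(\sum_{j=1}^{\infty}\beta_j\zeta_{t+j}\Bigr) - \sum_{t=1}^{n}\Bigl(\sum_{i=1}^{N}\alpha_i\eta_{t+i}\Bigr)\Bigl(\sum_{j=1}^{N}\beta_j\zeta_{t+j}\Bigr)$$

$$= \sum_{t=1}^{n}\Bigl(\sum_{i=N+1}^{\infty}\alpha_i\eta_{t+i}\Bigr)\Bigl(\sum_{j=N+1}^{\infty}\beta_j\zeta_{t+j}\Bigr) + \sum_{t=1}^{n}\Bigl(\sum_{i=N+1}^{\infty}\alpha_i\eta_{t+i}\Bigr)\Bigl(\sum_{j=1}^{N}\beta_j\zeta_{t+j}\Bigr) + \sum_{t=1}^{n}\Bigl(\sum_{i=1}^{N}\alpha_i\eta_{t+i}\Bigr)\Bigl(\sum_{j=N+1}^{\infty}\beta_j\zeta_{t+j}\Bigr)$$

$$= W_1 + W_2 + W_3. \qquad (21)$$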
We must now calculate the variance of these terms. Consider again (9). We have shown 
(11) that 


25.644, <20%0%(| fol +|Ar|+--? 
< 2o902(| ay | +| x, |+...)%, 


and we now apply this to the three sums in equation (21). It follows that if $N$ be chosen to satisfy the conditions (12) and (13) then

$$\lim n^{-1}E(W_1^2) < K\sigma_1^2\sigma_2^2\epsilon^2, \qquad \lim n^{-1}E(W_2^2) < K\sigma_1^2\sigma_2^2\epsilon^2, \qquad \lim n^{-1}E(W_3^2) < K\sigma_1^2\sigma_2^2\epsilon^2,$$

where $K$ is a constant independent of $N$. It follows that

$$S_n = S_n' + W_1 + W_2 + W_3,$$

where the variance of $W_1$, $W_2$ and $W_3$ can be made small compared with that of $S_n$ by choosing $N$ large. Then by first choosing $N$ large and then $n$ and using Tchebycheff's inequality, we see that the distribution of $S_n$ tends to normality with variance $E(S_n^2)$ and this completes the proof.

In the general application of the above results some care is needed. We can suppose that our empirical values of X and Y are distributed about their sample means, which we take to be zero, and we must estimate the variance of $S_n$ from formulae (9), or (approximately) from (10). But we must not insert in this formula the sample covariances for the $c_s$ and the $d_s$ because, as Bartlett (1946) has shown, the standard errors of the sample values of these covariances are of order $n^{-1/2}$ and we cannot therefore expect the series (10) to converge, let













P. A. P. Moran 291 


alone give the correct value. To use the formula correctly we must first decide on the order 
and coefficients of the stochastic difference equation which we can suppose generated the 
series and, from these coefficients, calculate the value of (9). 

In the case where the series are generated by a three-term difference equation, the calculations are simplified. Suppose the X and Y satisfy the equations
\[ X(t+2) + aX(t+1) + bX(t) = \eta(t+2), \]
\[ Y(t+2) + AY(t+1) + BY(t) = \zeta(t+2), \]
where $E(\eta(t)) = E(\zeta(t)) = 0$ and $E(\eta^2(t)) = \sigma_1^2$, $E(\zeta^2(t)) = \sigma_2^2$, as before. For the series to be stationary, we must have $b < 1$, $B < 1$. We suppose that in addition to this the series are oscillatory and so $a^2 < 4b$, $A^2 < 4B$. The solutions will then be


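\[ X(t) = \sum_{s=0}^{\infty} 2(4b - a^2)^{-1/2}\, p^s \sin\theta s \; \eta(t-s+1), \]
\[ Y(t) = \sum_{s=0}^{\infty} 2(4B - A^2)^{-1/2}\, P^s \sin\phi s \; \zeta(t-s+1), \]
where $p = b^{1/2}$, $P = B^{1/2}$, $\cos\theta = -a(2b^{1/2})^{-1}$, $\cos\phi = -A(2B^{1/2})^{-1}$. Also (Kendall, 1946, p. 408)
\[ \frac{c_s}{c_0} = \frac{p^s \sin(s\theta + \psi)}{\sin\psi}, \qquad \frac{d_s}{d_0} = \frac{P^s \sin(s\phi + \Phi)}{\sin\Phi}, \]
where
\[ \tan\psi = \frac{1+p^2}{1-p^2}\tan\theta \quad \text{and} \quad \tan\Phi = \frac{1+P^2}{1-P^2}\tan\phi, \]
and
\[ c_0 = \frac{(1+b)\,\sigma_1^2}{(1-b)\{(1+b)^2 - a^2\}}, \qquad d_0 = \frac{(1+B)\,\sigma_2^2}{(1-B)\{(1+B)^2 - A^2\}}. \]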


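We then need to calculate
\[ C = c_0 d_0 + 2\sum_{s=1}^{\infty} c_s d_s = c_0 d_0 \Big\{ 1 + 2\sum_{s=1}^{\infty} \frac{p^s P^s \sin(s\theta + \psi)\sin(s\phi + \Phi)}{\sin\psi \sin\Phi} \Big\} \]
\[ = c_0 d_0 \Big[ 1 + \frac{pP}{\sin\psi \sin\Phi} \Big\{ \frac{\cos(\psi - \Phi + \theta - \phi) - pP\cos(\psi - \Phi)}{1 - 2pP\cos(\theta - \phi) + p^2P^2} - \frac{\cos(\psi + \Phi + \theta + \phi) - pP\cos(\psi + \Phi)}{1 - 2pP\cos(\theta + \phi) + p^2P^2} \Big\} \Big]. \quad (22) \]
It is probably easiest to calculate C from this equation rather than attempt to simplify (22) still further. I hope to discuss the practical application of these formulae in another paper.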
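As a numerical check on the summation above, the following Python sketch compares the truncated series for $C/(c_0 d_0)$ with the closed form obtained from the product-to-sum identity and a geometric series. The function names and the coefficient values used in the check are illustrative assumptions, not part of the paper.

```python
import math

def params(a, b):
    """Return (p, theta, psi) for the equation X(t+2) + a X(t+1) + b X(t) = eta(t+2).

    Assumes 0 < b < 1 and a*a < 4*b (stationary, oscillatory case).
    """
    p = math.sqrt(b)
    theta = math.acos(-a / (2 * p))
    # tan(psi) = (1 + p^2) / (1 - p^2) * tan(theta), psi taken in (0, pi)
    psi = math.atan2((1 + p * p) * math.sin(theta), (1 - p * p) * math.cos(theta))
    return p, theta, psi

def ratio_series(a, b, A, B, terms=2000):
    """C / (c0 d0) from the series 1 + 2 * sum_s (c_s / c0)(d_s / d0), truncated."""
    p, th, psi = params(a, b)
    P, ph, Phi = params(A, B)
    total = 1.0
    for s in range(1, terms + 1):
        r_x = p ** s * math.sin(s * th + psi) / math.sin(psi)
        r_y = P ** s * math.sin(s * ph + Phi) / math.sin(Phi)
        total += 2 * r_x * r_y
    return total

def ratio_closed(a, b, A, B):
    """C / (c0 d0) from the closed form of the same sum."""
    p, th, psi = params(a, b)
    P, ph, Phi = params(A, B)
    r = p * P

    def g(alpha, beta):
        # sum over s >= 1 of r^s cos(s*alpha + beta), divided by r
        return (math.cos(alpha + beta) - r * math.cos(beta)) / (
            1 - 2 * r * math.cos(alpha) + r * r)

    bracket = g(th - ph, psi - Phi) - g(th + ph, psi + Phi)
    return 1 + r * bracket / (math.sin(psi) * math.sin(Phi))
```

Multiplying either ratio by $c_0 d_0$ gives C; the two functions should agree to rounding error for any admissible coefficients.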













REFERENCES 


Wold, H. (1938). A Study in the Analysis of Stationary Time Series. Uppsala.

Bernstein, S. (1927). Math. Ann. 97, 1.

Bartlett, M. S. (1946). Supp. J. Roy. Statist. Soc. 8, 27.

Kendall, M. G. (1946). Advanced Theory of Statistics, 2. London: Charles Griffin and Co.











[ 292 ] 


RANK CORRELATION BETWEEN TWO VARIABLES, ONE OF 
WHICH IS RANKED, THE OTHER DICHOTOMOUS 


By J. W. WHITFIELD, Psychological Laboratory, University of Cambridge 


Rank correlation is one of the most useful statistical techniques available for the treatment of data arising in experimental and applied psychological research. Chambers (1946) has indicated the type of data most frequently occurring in these fields, and has pointed out the advantages of Kendall's $\tau$ over Spearman's $\rho$ or any form of transformation to ordinal form.

Given the use of $\tau$ when tied rankings are present (Kendall, 1946) it seemed possible to extend the method to cover a very common problem in psychology, namely, determination of the relation between two variables, one of which is expressed as a ranking and the other as a dichotomy. In applied or field work the relation of a psychological 'measurement' and an external criterion nearly always appears in this form. The usual method of determining the relationship consists of reducing the ranking to a dichotomy and calculating $\chi^2$ for the 2 x 2 table which results. That this may lead to inaccuracy can be seen from the following example:


Variable A   1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20
Variable B   +  +  +  +  +  +  +  -  -  -  +  +  +  -  -  -  -  -  -  -
Variable C   -  -  -  +  +  +  +  +  +  +  -  -  -  -  -  -  -  +  +  +


Here the data are supposed to be ranked according to variable A and dichotomized into 
+ and — with respect to variables B and C. 


Treating the relation between variables A and B as a 2 x 2 contingency table:

                                  Variable B
                                   +      -
Variable A: Rankings 1-10          7      3
            Rankings 11-20         3      7

















Applying $\chi^2$, P is found to be 0.074 without Yates's correction for continuity, or 0.180 if the correction is applied.

But $\chi^2$ is exactly the same for the contingency table relating variables A and C, although it is obvious from the data that there is considerable difference in the two relationships, the evidence for which is sacrificed by reducing the ranking to a dichotomy.
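The point that the dichotomized ranking cannot distinguish B from C can be checked directly. The sketch below, with hypothetical function names, builds the 2 x 2 table from each sign sequence and evaluates $\chi^2$ with and without Yates's correction, taking the probability from the normal tail since there is one degree of freedom.

```python
import math

B = "+++++++---+++-------"   # variable B in order of the ranking A
C = "---+++++++-------+++"   # variable C in order of the ranking A

def table(signs):
    """2 x 2 table (rank 1-10 '+', rank 1-10 '-', rank 11-20 '+', rank 11-20 '-')."""
    top, bottom = signs[:10], signs[10:]
    return top.count("+"), top.count("-"), bottom.count("+"), bottom.count("-")

def chi2_2x2(a, b, c, d, yates=False):
    """Chi-square for the table [[a, b], [c, d]] and its probability on 1 d.f."""
    n = a + b + c + d
    diff = abs(a * d - b * c)
    if yates:
        diff = max(diff - n / 2, 0)
    stat = n * diff ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Upper tail of chi-square with 1 d.f. equals the two-sided normal tail.
    return stat, math.erfc(math.sqrt(stat / 2))
```

Both sequences give the same table, and hence the same $\chi^2$, whatever the order of the signs within each half of the ranking.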

If, alternatively, we consider the dichotomous variable as a ranking composed entirely of two sets of tied rankings, we may calculate the coefficients between A and B, A and C respectively, which I shall denote by $\tau_{AB}$, $\tau_{AC}$. The corresponding values of S will be found to be, after the manner described by Kendall (1946):
\[ S_{AB} = +70 - 9 + 21 = +82, \]
\[ S_{AC} = -30 + 49 - 21 = -2. \]
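A minimal sketch of the scoring, assuming the convention that a '+' ranked above a '-' contributes +1 and the reverse contributes -1, while pairs within the same class are tied and contribute nothing. The function name is an assumption for illustration.

```python
def s_score(dichotomy):
    """S for an untied ranking against a dichotomy.

    dichotomy[i] is '+' or '-' for the member holding rank i + 1.
    """
    s = 0
    for i, first in enumerate(dichotomy):
        for second in dichotomy[i + 1:]:
            if first == "+" and second == "-":
                s += 1
            elif first == "-" and second == "+":
                s -= 1
    return s

B = "+++++++---+++-------"
C = "---+++++++-------+++"
s_ab = s_score(B)
s_ac = s_score(C)
```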

For the calculation of $\tau$ in the case of tied rankings we have a choice in the denominator by which S is to be divided to give $\tau$. In the untied case this would be $\tfrac12 n(n-1)$, where n is the number of ranks. In the tied case we may take the denominator S' as $\tfrac12 n(n-1)$ or as $[\tfrac12 n(n-1)\{\tfrac12 n(n-1) - \tfrac12\sum t(t-1)\}]^{1/2}$, where $t_1$, $t_2$, etc., are the extents of the ties. The choice is determined by practical considerations (see Kendall, 1946), but is not material to a discussion of significance. For an untied ranking and a dichotomy with 'x' and 'y' members in each class, the second form reduces to $\{\tfrac12 xyn(n-1)\}^{1/2}$.

In the case of two untied rankings Kendall has shown that $\operatorname{var} S = \tfrac{1}{18} n(n-1)(2n+5)$. In the case of one untied ranking and one with ties of extent $t_1$, $t_2$, etc., Sillitto (1947) has extended this result by proving that
\[ \operatorname{var} S = \tfrac{1}{18}\{n(n-1)(2n+5) - \sum t(t-1)(2t+5)\}. \quad (1) \]
In the case of an untied ranking and a dichotomy, $t_1 = x$, $t_2 = y$, $(x+y) = n$, and we have then the simple form
\[ \operatorname{var} S = \tfrac13 xy(n+1). \quad (2) \]
In the example above this gives
\[ \operatorname{var} S = \frac{(10)(10)(21)}{3} = 700, \quad \sqrt{(\operatorname{var} S)} = 26.46, \quad \frac{S_{AB} - 1}{\sqrt{(\operatorname{var} S)}} = \frac{82 - 1}{26.46} = 3.06. \]
The probability of a deviation greater than this in absolute value is 0.0022. Further,
\[ \frac{|S_{AC}| - 1}{\sqrt{(\operatorname{var} S)}} = \frac{2 - 1}{26.46} = 0.0378, \]
and the corresponding probability is 0.970.
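The arithmetic of this example can be retraced as follows. This is a sketch only: the function names are assumptions, and the continuity correction of 1 (half the interval of 2 between successive S values) is passed explicitly.

```python
import math

def var_s_sillitto(n, ties):
    """Equation (1): one untied ranking, one ranking with ties of the given extents."""
    return (n * (n - 1) * (2 * n + 5)
            - sum(t * (t - 1) * (2 * t + 5) for t in ties)) / 18

def var_s_dichotomy(x, y):
    """Equation (2): an untied ranking against a dichotomy into x and y members."""
    return x * y * (x + y + 1) / 3

def two_sided(s, var, correction=1.0):
    """Normal deviate with continuity correction and its two-sided probability."""
    z = (abs(s) - correction) / math.sqrt(var)
    return z, math.erfc(z / math.sqrt(2))

z_ab, p_ab = two_sided(82, var_s_dichotomy(10, 10))
z_ac, p_ac = two_sided(-2, var_s_dichotomy(10, 10))
```

Equation (2) is the special case of equation (1) with two ties of extents x and y, which the first two functions should confirm for any x and y.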


VARIANCE WHEN THERE ARE TIES IN THE RANKING 


The variance of S given by equation (2) is true only in the case of a dichotomy and an untied ranking. For a tied ranking I surmised from some special cases that
\[ \operatorname{var} S = \frac{xy}{3n(n-1)}\{(n^3 - n) - \sum(t^3 - t)\}. \quad (3) \]
In the note following this paper Mr Kendall provides proof of this result.

Example (from data collected by the Medical Research Council team in Germany 1946, as yet unpublished). Selected workers in a factory were interviewed and an assessment made of their adaptation to living conditions. They were assessed as 'Efficient' or 'Overactive'. Other data were available, including statements by the men of frequency of nocturia. For men aged 50-59 years the following was observed:

Assessment    Rank order of frequency of nocturia
              (least frequent nocturia given highest rank)

Efficient     2½, 2½, 2½, 2½, 6½, 6½, 10, 10, 10, 10, 14, 14
Overactive    5, 10, 14, 16, 17














Five is the highest ranking in the overactive group. Four members of the efficient group have higher rankings, and eight lower rankings. The S score for that member is therefore 4 - 8. Similarly, for all members we have
\[ S = 4 - 8 + 6 - 2 + 10 + 12 + 12 = +34. \]
Using a denominator in the form
\[ [xy\{\tfrac12 n(n-1) - \tfrac12\sum t(t-1)\}]^{1/2}, \]
$\tau$ is given by
\[ \tau = \frac{+34}{\sqrt{[(12)(5)\{\tfrac12(17)(16) - \tfrac12(4)(3) - \tfrac12(2)(1) - \tfrac12(5)(4) - \tfrac12(3)(2)\}]}} = \frac{+34}{\sqrt{6960}} \approx +0.41. \]
From (3) we then have
\[ \operatorname{var} S = \frac{(12)(5)}{3(17)(16)}\{(17^3 - 17) - (4^3 - 4) - (2^3 - 2) - (5^3 - 5) - (3^3 - 3)\} = 344.6. \]
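The quantities in this example follow mechanically from the two lists of rank values. The sketch below is an illustration under assumed names: it scores each overactive member against the efficient group, finds the tie extents from the pooled ranking, and evaluates the denominator and equation (3).

```python
import math
from collections import Counter

efficient = [2.5] * 4 + [6.5] * 2 + [10] * 4 + [14] * 2
overactive = [5, 10, 14, 16, 17]

def s_between(group_a, group_b):
    """Sum over pairs of +1 when the group_b member has the larger rank value,
    -1 when it has the smaller, 0 when the two are tied."""
    return sum((b > a) - (b < a) for b in group_b for a in group_a)

def tie_extents(ranks):
    """Extents of the ties (sets of equal rank values with more than one member)."""
    return [k for k in Counter(ranks).values() if k > 1]

x, y = len(efficient), len(overactive)
n = x + y
ties = tie_extents(efficient + overactive)

S = s_between(efficient, overactive)
denominator = math.sqrt(x * y * (n * (n - 1) / 2 - sum(t * (t - 1) for t in ties) / 2))
tau = S / denominator
var_S = x * y / (3 * n * (n - 1)) * ((n ** 3 - n) - sum(t ** 3 - t for t in ties))
```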





A small problem arises when we consider the correction for continuity to be applied in testing the significance of an observed value of S. In the case of a dichotomy and an untied ranking the interval between successive S values is 2. In the case of a dichotomy and a ranking composed entirely of ties of the same extent 't', the interval is 2t. But in the example the ties are of varying extent, and the interval between successive S values is composed of a mixture of the intervals produced by the successive rank values. Thus, although these varying intervals are combined so that over most of the range the interval between successive values is unity, the distribution oscillates somewhat, and to use the value ½ as the correction for continuity would sometimes be misleading. Further work is required to determine the correction which will provide a probability on the normal distribution equal to or slightly greater than the true probability in all cases. Until this is available I propose to use a crude correction, based on the average of the intervals mentioned above. In the example the successive rank values 2½ and 5 give an interval of 5 in S score, rank values 5 and 6½ give an interval of 3, and it is therefore possible to determine the average interval by calculating the intervals given by successive rank values. This calculation can be shortened. The total of the S score intervals is twice the number of members, less the extent of the ties involving the first and last members. If we divide this by the number of intervals between successive rank values we have the average S score interval. In the example this is $\tfrac16(34 - 4 - 1)$. Using half of this as the correction for continuity we have
\[ \frac{S - 2.42}{\sqrt{(\operatorname{var} S)}} = \frac{34 - 2.42}{18.56} = 1.70. \]


The pre-observational hypothesis, made on psychological grounds, was that excessive nocturia is a symptom of inefficient adaptation to living conditions, i.e. a positive correlation should be obtained. From these observations the probability of a positive correlation as great or greater than the observed value appearing by chance is 0.044. Direct calculation of the positive tail of the distribution of S gives a probability of 0.0368.

The alternative testing hypothesis based on the absolute value of S gives a probability twice as great, and the corresponding direct calculation using both positive and negative tails of the actual S distribution gives a probability of 0.0735.

By itself this evidence could only be debatable substantiation of the psychological hypothesis. In fact, additional data from two other factory groups, treated in the same way, gave a total S value of +104, the square-root of the total variance being 35.00, providing a justification of the hypothesis.
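The direct calculation referred to above can be imitated by enumeration. The sketch below is an assumed approach, not necessarily the one the author used: it takes each of the 6188 ways of choosing which five of the seventeen ranked members are 'Overactive' as equally likely, and tabulates S over all of them.

```python
from itertools import combinations

# Pooled rank values of the 17 men (ties carry the mid-rank).
ranks = [2.5] * 4 + [5] + [6.5] * 2 + [10] * 5 + [14] * 3 + [16, 17]

def s_for(chosen):
    """S when the members at the chosen positions form the 'Overactive' class."""
    chosen = set(chosen)
    s = 0
    for i in chosen:
        for j in range(len(ranks)):
            if j not in chosen:
                s += (ranks[i] > ranks[j]) - (ranks[i] < ranks[j])
    return s

values = [s_for(c) for c in combinations(range(len(ranks)), 5)]
mean = sum(values) / len(values)
variance = sum(v * v for v in values) / len(values) - mean ** 2
upper_tail = sum(v >= 34 for v in values) / len(values)
lower_tail = sum(v <= -34 for v in values) / len(values)
```

The enumerated variance should reproduce equation (3), and `upper_tail` and `upper_tail + lower_tail` are the quantities to compare with the one-tail and two-tail probabilities quoted in the text.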










THE CASE OF THE 2x 2 TABLE 


If one dichotomous variable can be considered as a ranking with two sets of tied ranks it is logical to consider the case when both variables are in this form. If we have a 2 x 2 table in the form

    (AB)   (Ab)   (A)
    (aB)   (ab)   (a)
    (B)    (b)    N

















any member of (AB) taken with any member of (ab) has the same order in either ranking and hence contributes +1 to S, and any member of (Ab) with any member of (aB) contributes -1. The others contribute nothing. Hence
\[ S = (AB)(ab) - (Ab)(aB). \]
From equation (3)
\[ \operatorname{var} S = \frac{(A)(a)}{3N(N-1)}[(N^3 - N) - \{(B)^3 - (B)\} - \{(b)^3 - (b)\}] = \frac{(A)(a)(B)(b)}{N-1}. \quad (4) \]


Again, for testing the significance of an observed value of S it is necessary to correct for continuity by subtracting half the interval between successive S values. In the case of the 2 x 2 table the interval is N, for if we increase (AB) by unity S becomes
\[ \{(AB) + 1\}\{(ab) + 1\} - \{(Ab) - 1\}\{(aB) - 1\} = (AB)(ab) - (Ab)(aB) + N. \]
Hence, for the normal deviate, we have
\[ \frac{S - \tfrac12 N}{\sqrt{\left[\dfrac{(A)(a)(B)(b)}{N-1}\right]}}. \]
It will be noted that $\tau$ (taking the ties into account in calculating the denominator S') is
\[ \frac{(AB)(ab) - (Ab)(aB)}{\sqrt{[(A)(a)(B)(b)]}}, \]
which is the product-moment correlation for a 2 x 2 table when the variables are conventionally regarded as possessing the discrete values 0, 1.

Testing by use of the normal deviate seems to be moderately accurate, and would appear to be useful in those cases where $\chi^2$ is suspect because of small expectations in the cells of the 2 x 2 table. It is less laborious to calculate than the hypergeometric treatment, and is an alternative form of the approximation to hypergeometric treatment given by Pearson (1947), who also discusses the order of accuracy of the approximation.

Using the data given earlier as an example, but assuming that it had been possible only to grade nocturia into 'Normal' or 'Excessive', we have the following table:

                         Nocturia
Assessment          Normal    Excessive
Efficient             10          2
Overactive             2          3

\[ S = 30 - 4 = 26, \quad \operatorname{var} S = \frac{(12)(5)(12)(5)}{16} = 225. \]
This gives, after correction for continuity,
\[ \frac{S - \tfrac12 N}{\sqrt{(\operatorname{var} S)}} = \frac{26 - 8.5}{15} = 1.167. \]
This gives the probability of S being attained or exceeded in the direction of the hypothesis (i.e. positive values only) as 0.1217. $\chi^2$ without the continuity correction gives P = 0.0369,* and with the correction, P = 0.1143. The hypergeometric treatment, summing the probabilities of obtaining 3, 4 or 5 in the Overactive-Excessive category, gives P = 0.1166.

If the more customary test of absolute value is applied, $\chi^2$ with Yates's correction gives P = 0.2286, S and the normal deviate give P = 0.2434, i.e. both values of P are doubled. The hypergeometric treatment, adding the probability of obtaining 0 in the Overactive-Excessive category, gives P = 0.2445.

It will be seen that in conditions such as these, S and the normal deviate give a reasonable approximation to the exact treatment.
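A short sketch of the comparison for this table, with hypothetical function names. It evaluates S, its variance from equation (4), the corrected normal deviate with its one-tail probability, and the hypergeometric tail for the (ab) cell with all margins fixed.

```python
import math

def s_and_deviate(AB, Ab, aB, ab):
    """S, var S from equation (4), corrected normal deviate, one-tail probability."""
    A, a, B, b = AB + Ab, aB + ab, AB + aB, Ab + ab
    N = A + a
    S = AB * ab - Ab * aB
    var = A * a * B * b / (N - 1)
    z = (abs(S) - N / 2) / math.sqrt(var)
    return S, var, z, 0.5 * math.erfc(z / math.sqrt(2))

def hypergeometric_upper(AB, Ab, aB, ab):
    """Probability of ab or more in the (ab) cell when all margins are fixed."""
    A, a, B, b = AB + Ab, aB + ab, AB + aB, Ab + ab
    N = A + a
    ways = sum(math.comb(b, k) * math.comb(B, a - k)
               for k in range(ab, min(a, b) + 1))
    return ways / math.comb(N, a)

S, var, z, p_normal = s_and_deviate(10, 2, 2, 3)
p_exact = hypergeometric_upper(10, 2, 2, 3)
```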





* This is making the common assumption that $\{(AB)(ab) - (Ab)(aB)\}^2 N/\{(A)(a)(B)(b)\}$ is distributed as $\chi^2$ with 1 degree of freedom, or that its square root is a normal deviate with sign depending on the sign of $(AB)(ab) - (Ab)(aB)$.


REFERENCES 


Chambers, E. G. (1946). Statistical techniques in applied psychology. Biometrika, 33, 269.

Kendall, M. G. (1938). A new measure of rank correlation. Biometrika, 30, 81.

Kendall, M. G. (1946). The treatment of ties in ranking problems. Biometrika, 33, 239.

Pearson, E. S. (1947). The choice of statistical tests illustrated on the interpretation of data classed in a 2 x 2 table. Biometrika, 34, 139.

Sillitto, G. P. (1947). The distribution of Kendall's coefficient of rank correlation in rankings containing ties. Biometrika, 34, 36.










[ 297 ] 


THE VARIANCE OF $\tau$ WHEN BOTH RANKINGS CONTAIN TIES
By M. G. KENDALL


1. The variance of $\tau$ in the population of sample permutations was given in my paper of
1938 for the case where no tied ranks exist. Mr Sillitto (1947) has given the formula where 
one ranking contains ties but the other does not. In the foregoing paper Mr Whitfield has 
correctly surmised the variance when one ranking contains ties, and the other is a dichotomy. 
In this note I derive the general formula for the variance when both rankings contain ties. 
The results of Messrs Sillitto and Whitfield then follow as special cases. 


2. I shall follow the method of Daniels (1944). If $a_{ij}$ represents the contribution of the ith and jth members of a ranking to $\tau$ we have
\[ a_{ij} = +1 \ (i < j), \qquad = 0 \ (i = j), \qquad = -1 \ (i > j). \quad (1) \]
We write
\[ c_{ij} = a_{ij} b_{ij}, \quad (2) \]
where a and b refer to different rankings, and
\[ c = \sum_{i,j=1}^{n} c_{ij}. \quad (3) \]
The quantity c is simply related to S by the relation
\[ c = 2S, \quad (4) \]
and for the testing of $\tau$ it is sufficient to test c or S, which are merely constant multiples of $\tau$. I work with the quantity c.


3. We have, from Daniels's results,
\[ \sum_{l=1}^{n} a_{il} = n + 1 - 2i, \quad (5) \]
\[ \sum_{i,l=1}^{n} a_{il}^2 = n(n-1), \quad (6) \]
\[ \sum_{i,l,t=1}^{n} a_{il} a_{it} = \tfrac13 n(n^2 - 1), \quad (7) \]
\[ E(c) = 0, \quad (8) \]
\[ E(c^2) = \frac{4}{n(n-1)(n-2)} \{\sum a_{il} a_{it} - \sum a_{il}^2\}\{\sum b_{il} b_{it} - \sum b_{il}^2\} + \frac{2}{n(n-1)} \sum a_{il}^2 \sum b_{il}^2. \quad (9) \]
If we substitute from (6) and (7) in (9) we find
\[ E(c^2) = \tfrac29 n(n-1)(2n+5), \quad (10) \]
or, equivalently,
\[ E(S^2) = \tfrac{1}{18} n(n-1)(2n+5), \quad (11) \]
from which the variance of $\tau$ in the case of untied rankings follows at once.
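For small n the untied result can be confirmed by brute force over all n! permutations. This is a sketch with assumed function names; it also checks the sum of squared row totals $\sum_i (n + 1 - 2i)^2$ against $\tfrac13 n(n^2 - 1)$.

```python
from itertools import permutations

def s_value(perm):
    """S for the natural order 0..n-1 against the permutation perm."""
    n = len(perm)
    return sum((perm[j] > perm[i]) - (perm[j] < perm[i])
               for i in range(n) for j in range(i + 1, n))

def mean_square_s(n):
    """E(S^2) over all n! equally likely permutations."""
    values = [s_value(p) for p in permutations(range(n))]
    return sum(v * v for v in values) / len(values)

def sum_of_squared_row_totals(n):
    """Sum over i of (n + 1 - 2i)^2, the left-hand side of the third identity."""
    return sum((n + 1 - 2 * i) ** 2 for i in range(1, n + 1))
```

Since c = 2S, the corresponding value of $E(c^2)$ is four times `mean_square_s(n)`.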


4. Now suppose that sets of $t_1, t_2, \ldots$ consecutive members in one ranking are tied. In place of (6) we then have
\[ \sum_{i,l=1}^{n} a_{il}^2 = n(n-1) - \sum t(t-1), \quad (12) \]
the summation on the right taking place over the various values of t. This result follows simply from the consideration that for a pair of tied ranks $a_{ij} = 0$, and consequently the sum of squares of contributions from a tied set is of the same form as for the ranking as a whole.

In place of (7) we have
\[ \sum_{i,l,t=1}^{n} a_{il} a_{it} = \tfrac13 n(n^2 - 1) - \tfrac13 \sum t(t^2 - 1). \quad (13) \]
This is not quite so obvious. Consider a set of tied ranks. The contribution to the sum on the left of (13) will be unchanged if the suffixes l, t fall outside this set. If they both fall inside, no contribution arises and therefore we have to subtract the term $\tfrac13 t(t^2 - 1)$. The remaining possibility is that one falls inside and one outside. In such a case the contribution remains unchanged in total for it is zero in the original untied case, each possible pair occurring once to give +1 and once to give -1. Formula (13) follows.


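5. By substitution in (9) we then have, for two rankings with ties typified respectively by t and u,
\[ E(c^2) = \frac{4}{n(n-1)(n-2)} \{\tfrac13 n(n-1)(n-2) - \tfrac13 \sum t(t-1)(t-2)\} \times \{\tfrac13 n(n-1)(n-2) - \tfrac13 \sum u(u-1)(u-2)\} \]
\[ \qquad + \frac{2}{n(n-1)} \{n(n-1) - \sum t(t-1)\}\{n(n-1) - \sum u(u-1)\}. \quad (14) \]
This is the general formula required. We can express it in the alternative form
\[ E(c^2) = \tfrac29 n(n-1)(2n+5) - \tfrac29 \sum t(t-1)(2t+5) - \tfrac29 \sum u(u-1)(2u+5) \]
\[ \qquad + \frac{4}{9n(n-1)(n-2)} \{\sum t(t-1)(t-2)\}\{\sum u(u-1)(u-2)\} + \frac{2}{n(n-1)} \{\sum t(t-1)\}\{\sum u(u-1)\}. \quad (15) \]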

6. (i) If one ranking is untied, say all the u's are zero, we have Mr Sillitto's result
\[ E(c^2) = \tfrac29 n(n-1)(2n+5) - \tfrac29 \sum t(t-1)(2t+5). \quad (16) \]
(ii) If one ranking is untied and the other is a dichotomy into x and n - x = y members, (16) reduces to
\[ E(c^2) = \tfrac43 xy(n+1), \quad (17) \]
agreeing with Mr Whitfield's equation (2).
(iii) If one ranking contains ties and the other is a dichotomy we find on substitution in (14)
\[ E(c^2) = \frac{4xy}{3n(n-1)} \{n^3 - n - \sum(t^3 - t)\}, \quad (18) \]
agreeing with Mr Whitfield's equation (3).
(iv) Finally, if both variates are dichotomized into x, y and p, q we find
\[ E(c^2) = \frac{4xypq}{n-1}, \quad (19) \]
agreeing with Mr Whitfield's equation (4).
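The four special cases can be checked numerically against the general formula. The sketch below, using an assumed function name, returns var S, that is $E(c^2)/4$, from the alternative form of the general result, with the tie extents of the two rankings passed as sequences.

```python
def var_s(n, t=(), u=()):
    """Variance of S for rankings of n members with tie extents t and u (n > 2)."""
    def s2(ties):
        return sum(k * (k - 1) for k in ties)

    def s3(ties):
        return sum(k * (k - 1) * (k - 2) for k in ties)

    def s5(ties):
        return sum(k * (k - 1) * (2 * k + 5) for k in ties)

    v = (n * (n - 1) * (2 * n + 5) - s5(t) - s5(u)) / 18
    v += s3(t) * s3(u) / (9 * n * (n - 1) * (n - 2))
    v += s2(t) * s2(u) / (2 * n * (n - 1))
    return v
```

For instance, `var_s(20, u=(10, 10))` corresponds to an untied ranking of twenty against an even dichotomy, and `var_s(17, (4, 2, 5, 3), (12, 5))` to the nocturia example of the preceding paper.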


REFERENCES
See the references to Mr Whitfield's paper together with:


Daniels, H. E. (1944). The relation between measures of correlation in the universe of sample permutations. Biometrika, 33, 129.





[ 299 ] 


A $\chi^2$ 'SMOOTH' TEST FOR GOODNESS OF FIT
By F. N. DAVID 


1. The $\chi^2$ test occupies a central position in statistical theory, and it is difficult to imagine another test which would have the same generality of application. We shall be concerned here with one aspect only, that is, the uses of $\chi^2$ in the tests for agreement between hypothesis and observation which are usually loosely classed together under the name of tests for goodness of fit. The principal advantages of $\chi^2$ for such tests would seem to be (i) that it is applicable to grouped observations, (ii) that the parameters of the hypothesis tested may be calculated from the observational data and the fact allowed for in the degrees of freedom with which the criterion is assumed to be distributed and (iii) that it is easy to calculate, for the number of computations involved is just the number of groups into which the observational data are divided. It has, however, two defects which have long been known and which are easily recognized from the form of the criterion itself. Broadly speaking we may define $\chi^2$ as follows:
\[ \chi^2 = \text{sum for all groups of } \frac{(\text{Observed value} - \text{Expected value})^2}{\text{Expected value}}. \]
It will be seen (iv) that in taking the square of (Observed value - Expected value) the knowledge regarding the sign of the deviation is lost. Further (v) there is no means of preserving the order of the signs of the deviations, and no distinction can therefore be made between a departure from the hypothesis tested in which the deviations were first all positive and then all negative and a departure in which the sizes and signs of the deviations were random between themselves.

2. The ideal test for goodness of fit should certainly take into account (i), (ii), (iv) and (v) and probably (iii) also, but it is unlikely that this ideal will be reached. It would seem at the present time that the most which may be hoped for are tests which will supplement the $\chi^2$ test in that they will be more sensitive to given alternatives of the hypothesis under test. Neyman (1937) put forward such a supplementary test in which he developed the $\psi^2$ criterion. This criterion was designed to be sensitive to alternate hypotheses of a type he designated as smooth; that is to say, if the hypothesis under test is a continuous curve, such as, for example, the normal curve, admissible hypotheses alternate to it might be other normal curves with a different mean and a different standard deviation. Neyman's criterion certainly took into account (iv) and (v), but it did not fulfil (i), in that it was only applicable to ungrouped observations,* or (ii), in that the parameters of the hypothesis tested had to be known a priori. Whether it fulfilled (iii) is a matter for personal opinion.

* Prof. Neyman tells me that his criterion has been adapted for the case of grouped observations but that he has not yet published this extension.

3. It would appear possible that tests which would take into account the sign only of the deviations of observation from hypothesis, and the order of these signs, may be devised from simple combinatorial principles. Suppose that there are N observations which are divided into k groups. Let $n_i$ ($i = 1, 2, \ldots, k$) be the actual number of observations falling into the ith group, and let $m_i$ ($i = 1, 2, \ldots, k$) be the expected number. It is possible theoretically for $\chi^2$ to be calculated for the case where
\[ \sum_{i=1}^{k} n_i = N \neq \sum_{i=1}^{k} m_i, \]
but such cases must be rare in statistical practice. We shall overlook this case and will consider the case where the totals of observed and expected are made equal to one another with the resultant loss of one degree of freedom in the calculation of $\chi^2$. If the totals agree then
\[ \sum_{i=1}^{k} n_i - \sum_{i=1}^{k} m_i = \sum_{i=1}^{k} \delta_i = 0, \]
where $\delta_i = n_i - m_i$. In order that the sum of these $\delta$'s should be zero, at least one of them must be negative in sign, but which one of these $\delta$'s it will be would seem to be a matter of chance. It is on this fact that we shall base the first test criterion.

4. Suppose that we have a sequence of signs of which $r_1$ are positive and $r_2$ negative, where $r_1 + r_2 = r$, and $r_1 > 0$ and $r_2 > 0$. These signs are postulated to occur in a random order. Given such a sequence it is easy to record the number of sets of positive and negative signs. For example, if the sequence is


+ + - - + + - - + - + + + + -,

then r = 15, $r_1 = 9$, $r_2 = 6$, and there are four sets of positive signs and four sets of negative signs. In general there can be ($\alpha$) t positive, t negative, or ($\beta$) t positive, t + 1 negative, or ($\gamma$) t + 1 positive and t negative sets of signs. If T = 2t or 2t + 1 as required, we may ask what is the probability that given $r_1$ and $r_2$ such a number T of sets (alternately positive and negative) would have arisen through chance. This probability follows at once from Whitworth, Choice and Chance, Proposition xxv, viz.: 'The number of ways in which n indifferent things can be distributed in t different parcels (blank lots being inadmissible) is
\[ (n-1)!/\{(t-1)!\,(n-t)!\}'.* \]
5. The total number of ways in which $r_1$ and $r_2$ elements can be arranged is
\[ \frac{r!}{r_1!\,r_2!}. \]
We now require to enumerate the number of ways in which $r_1$ can be arranged to form t sets and $r_2$ to form t sets. To arrange $r_1$ in t sets is equivalent (vide Whitworth) to making t - 1 breaks in a sequence of $r_1$ observations, and this may be done in
\[ (r_1 - 1)!/\{(t-1)!\,(r_1 - t)!\} \text{ ways,} \]
and similarly for $r_2$. It is not specified whether + or - should start the sequence, and hence the total number of ways in which a sequence $r_1 + r_2$ may be arranged in t sets each is
\[ \frac{2(r_1 - 1)!\,(r_2 - 1)!}{(t-1)!\,(t-1)!\,(r_1 - t)!\,(r_2 - t)!}. \]

* Since I first thought of this method of attack I have found that the distribution of groups as given by me in §5 has already been given by W. L. Stevens, Ann. Eugen., Lond., 9, 10, and by A. Wald and J. Wolfowitz, Ann. Math. Statist. 11, 147. The probability function has been tabled by F. S. Swed and C. Eisenhart, Ann. Math. Statist. 14, 66, but it is not in a form that I found suitable for my purposes. The probability function has been known for many years; what is interesting is the different uses to which it has been put.





The probability of 2t sets will be
\[ P\{2t \mid r_1, r_2\} = \frac{2\,r_1!\,(r_1 - 1)!\,r_2!\,(r_2 - 1)!}{r!\,\{(t-1)!\}^2\,(r_1 - t)!\,(r_2 - t)!}, \quad (1) \]
and the probability of obtaining (2t + 1) sets will be
\[ P\{2t+1 \mid r_1, r_2\} = P\{t \mid r_1, t+1 \mid r_2\} + P\{t+1 \mid r_1, t \mid r_2\} = P\{2t \mid r_1, r_2\}\left(\frac{r - 2t}{2t}\right). \quad (2) \]





Hence given $r_1$, $r_2$ and T from a random sequence of positive and negative signs the probability of such a number of sets having arisen through chance may be calculated.
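The two probabilities are easily evaluated with binomial coefficients, since $(r_1 - 1)!/\{(t-1)!\,(r_1 - t)!\}$ is the number of ways of choosing t - 1 breaks among $r_1 - 1$ gaps. The following is a sketch with assumed function names, using exact fractions.

```python
from fractions import Fraction
from math import comb

def count_sets(r1, r2, T):
    """Number of arrangements of r1 '+' and r2 '-' signs forming exactly T sets."""
    if T < 2:
        return 0
    t, odd = divmod(T, 2)
    a, b = r1 - 1, r2 - 1
    if not odd:
        # t sets of each sign, starting with either sign
        return 2 * comb(a, t - 1) * comb(b, t - 1)
    # t + 1 sets of one sign and t of the other
    return comb(a, t) * comb(b, t - 1) + comb(a, t - 1) * comb(b, t)

def prob_sets(r1, r2, T):
    """P{T | r1, r2}: the count divided by the binomial term r!/(r1! r2!)."""
    return Fraction(count_sets(r1, r2, T), comb(r1 + r2, r1))
```

Summing `count_sets` over all T for fixed $r_1$, $r_2$ should recover the binomial term, which gives a convenient check on any tabulated row.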

6. It is desired to use the probability of a given arrangement of signs in order to test a 
given hypothesis represented by a smooth probability law, bearing in mind that, if the given 
hypothesis is not true, then any alternative law is likely to be of a smooth type. Although 
no exact definition of a smooth alternative distribution has been made, it may be stated 
here that smooth, in the sense used by Neyman, will imply that the number of sets of signs 
will be small. For example, if the hypothesis tested is that observations follow a given normal 
curve, whereas in fact they have been drawn from a normal distribution identical with the 
first but with a smaller mean, then the differences between observation and expectation on
the basis of the hypothesis tested may be expected to give a preponderance of positive 
signs below the sample mean and of negative signs above it; that is to say, if the difference 
in means is sufficient to offset the sampling fluctuations we should find a single set of positive 
signs followed by a single set of negative signs. If the true population is a normal curve with 
the same mean but with a larger standard deviation than that specified by the hypothesis 
tested, then there will be a tendency towards a set of positive signs, a set of negative signs, 
followed by a set of positive signs, although sampling fluctuations may not leave such a 
clear-cut answer. The more complex the alternative hypothesis the less chance there will be 
of detecting it. 

7. With this objective in view it is proposed to take T, the number of sets of signs, as the test criterion, rejecting the hypothesis tested whenever, for a given $r_1$ and $r_2$, T is exceptionally small. This we do on the grounds that the existence of very few sets of signs suggests that the differences between observed and expected frequencies are not due to chance sampling fluctuations but to some systematic departure of the true probability law (assumed smooth) from hypothesis. In following this procedure we should reject the hypothesis if
\[ P\{T \le T_0\} = \sum_{T=2}^{T_0} P\{T \mid r_1, r_2\} \le \epsilon, \]
where $T_0$ is the observed value of T and $\epsilon$ the significance level selected as appropriate. Exact probabilities are given in Table 1, and the application of the test is immediate.* There seems to be no reason why the test should not be applicable to both grouped and ungrouped observations, although the formulation of the hypothesis tested may be somewhat different in the two cases. Consider a sample which has been supposedly randomly


* An assumption implicit in the test would appear to be that for each $\chi^2$ cell there is an equal chance of obtaining a positive or a negative deviation, that is, that there are sufficient numbers in each cell for the binomial to be closely approximated to by a normal curve. An extensive series of random sampling experiments has shown, however, that the divergence between theory and practice is not significant even when the probability of obtaining a positive is four times that of obtaining a negative. Hence while strictly the expectation in each cell of $\chi^2$ should be 10 or over, it would seem that for practical purposes the T test may be applied in all cases where the application of the $\chi^2$ test is permissible.








Table 1. Probability of obtaining a given number of sets, T. [T = 2t or 2t + 1]

The function tabled is
\[ \frac{2(r_1 - 1)!\,(r_2 - 1)!}{(t-1)!\,(t-1)!\,(r_1 - t)!\,(r_2 - t)!} \quad \text{or} \quad \frac{(r_1 - 1)!\,(r_2 - 1)!\,(r_1 + r_2 - 2t)}{t!\,(t-1)!\,(r_1 - t)!\,(r_2 - t)!}, \]
according as T is even or odd. P{T} is obtained by dividing this function by the binomial term $r!/(r_1!\,r_2!)$.

  r  r1  r2  r!/(r1!r2!)  T=2    3    4    5    6    7    8    9   10   11   12   13   14
  2   1   1            2    2
  3   2   1            3    2    1
  4   3   1            4    2    2
      2   2            6    2    2    2
  5   4   1            5    2    3
      3   2           10    2    3    4    1
  6   5   1            6    2    4
      4   2           15    2    4    6    3
      3   3           20    2    4    8    4    2
  7   6   1            7    2    5
      5   2           21    2    5    8    6
      4   3           35    2    5   12    9    6    1
  8   7   1            8    2    6
      6   2           28    2    6   10   10
      5   3           56    2    6   16   16   12    4
      4   4           70    2    6   18   18   18    6    2
  9   8   1            9    2    7
      7   2           36    2    7   12   15
      6   3           84    2    7   20   25   20   10
      5   4          126    2    7   24   30   36   18    8    1
 10   9   1           10    2    8
      8   2           45    2    8   14   21
      7   3          120    2    8   24   36   30   20
      6   4          210    2    8   30   45   60   40   20    5
      5   5          252    2    8   32   48   72   48   32    8    2
 11  10   1           11    2    9
      9   2           55    2    9   16   28
      8   3          165    2    9   28   49   42   35
      7   4          330    2    9   36   63   90   75   40   15
      6   5          462    2    9   40   70  120  100   80   30   10    1
 12  11   1           12    2   10
     10   2           66    2   10   18   36
      9   3          220    2   10   32   64   56   56
      8   4          495    2   10   42   84  126  126   70   35
      7   5          792    2   10   48   96  180  180  160   80   30    6
      6   6          924    2   10   50  100  200  200  200  100   50   10    2
 13  12   1           13    2   11
     11   2           78    2   11   20   45
     10   3          286    2   11   36   81   72   84
      9   4          715    2   11   48  108  168  196  112   70
      8   5         1287    2   11   56  126  252  294  280  175   70   21
      7   6         1716    2   11   60  135  300  350  400  250  150   45   12    1
 14  13   1           14    2   12
     12   2           91    2   12   22   55
     11   3          364    2   12   40  100   90  120
     10   4         1001    2   12   54  135  216  288  168  126
      9   5         2002    2   12   64  160  336  448  448  336  140   56
      8   6         3003    2   12   70  175  420  560  700  525  350  140   42    7
      7   7         3432    2   12   72  180  450  600  800  600  450  180   72   12    2








































drawn from some population. Let the elements of the sample in order of drawing be $x_1, x_2, \ldots, x_n$. We may use the T criterion to test the hypothesis of randomness, in the following way. If $u_1$ is the smallest value observed in the sample and $u_n$ the largest, then if we exclude the trivial case when all the x's are equal it is easy to show that $u_1 < \bar{x} < u_n$, where $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$. If we now consider the deviations
\[ x_i - \bar{x} = dx_i, \quad \text{for } i = 1, 2, \ldots, n, \]
there will be a series $dx_1, dx_2, \ldots, dx_n$, some of which quantities will be positive and some negative. The application of the T test is immediate, the admissible alternate hypotheses being that if the drawing of the sample is not at random then bias of the smooth kind is present.


8. As an illustration consider the following two cases:

Case I.   Expected frequency   10   25   35   75   155   155   75   35   25   10
          Observation          12   29   45   81   160   145   69   31   20    8
          Deviation             +    +    +    +     +     −    −    −    −    −

Case II.  Expected frequency   10   25   35   75   155   155   75   35   25   10
          Observation          12   23   45   66   161   160   69   36   20    8
          Deviation             +    −    +    −     +     +    −    +    −    −


In the first case $\chi^2 = 6.94$ and in the second $\chi^2 = 6.80$; in neither case would the hypothesis be rejected as inadequate by using the $\chi^2$ criterion. The $T$ criterion does, however, bring out the essential difference:

Case I.   $r_1 = 5$, $r_2 = 5$, $T_0 = 2$ and $P\{T \le T_0\} = 1/126$.
Case II.  $r_1 = 5$, $r_2 = 5$, $T_0 = 8$ and $P\{T \le T_0\} = 121/126$.

Using the $T$ test we should be inclined to reject the first hypothesis in favour of a smooth alternative, while for the second case we should be inclined to agree with the conclusion drawn from the $\chi^2$ test that the observational material is adequately described.
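
The two cases above can be checked with a short sketch. It assumes that relations (1) and (2) referred to in this paper are the usual counts of arrangements of $r_1$ positive and $r_2$ negative signs into $T$ sets; the function names `run_counts` and `sign_test` are illustrative only and do not come from the paper.

```python
from fractions import Fraction
from math import comb


def run_counts(r1, r2):
    """Arrangements of r1 '+' and r2 '-' signs (both >= 1) giving exactly T sets."""
    counts = {}
    for t in range(2, r1 + r2 + 1):
        k, odd = divmod(t, 2)
        if odd:  # T = 2k + 1: the same sign opens and closes the sequence
            n = (comb(r1 - 1, k) * comb(r2 - 1, k - 1)
                 + comb(r1 - 1, k - 1) * comb(r2 - 1, k))
        else:    # T = 2k: k sets of each sign, either sign may come first
            n = 2 * comb(r1 - 1, k - 1) * comb(r2 - 1, k - 1)
        if n:
            counts[t] = n
    return counts


def sign_test(observed, expected):
    """Return (r1, r2, T0, P{T <= T0}); assumes no deviation is exactly zero."""
    signs = [o > e for o, e in zip(observed, expected)]
    r1 = sum(signs)
    r2 = len(signs) - r1
    t0 = 1 + sum(a != b for a, b in zip(signs, signs[1:]))  # number of sets
    favourable = sum(n for t, n in run_counts(r1, r2).items() if t <= t0)
    return r1, r2, t0, Fraction(favourable, comb(r1 + r2, r1))


expected = [10, 25, 35, 75, 155, 155, 75, 35, 25, 10]
case_1 = [12, 29, 45, 81, 160, 145, 69, 31, 20, 8]
case_2 = [12, 23, 45, 66, 161, 160, 69, 36, 20, 8]
print(sign_test(case_1, expected))  # r1 = 5, r2 = 5, T0 = 2, P = 1/126
print(sign_test(case_2, expected))  # r1 = 5, r2 = 5, T0 = 8, P = 121/126
```

Exact fractions are used so that the small probabilities are not blurred by rounding.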

9. Sampling material is available whereby the theoretical distribution of $T$ may be tested in practice. Neyman & Pearson (1928) took 208 samples, each of size 200, from a population of eight groups described by the cubic curve


y = 25+ 482-3 52°. 


The expectation in each cell for a sample of this size was calculated and the $\chi^2$ criterion found for each of the 208 samples. The writer was given access to these calculations and was able to find the sampling distribution of $T$ from the material. The results of this sampling experiment and the theoretical distribution of $T$ from relations (1) and (2) are given in Table 2.

The agreement between theory and practice would seem to be reasonably good, and in the cases (4, 4) and (5, 3) the values of $\chi^2$, calculated to test the discrepancy between theory and practice, were not greater than might be attributable to sampling fluctuations. It was not thought worth while to calculate $\chi^2$ for (6, 2) and (7, 1). A second sampling experiment in which samples of size 360 were drawn from a normal population of fifteen groups lent further support to the reasonableness of the theoretical distribution.

10. The $T$ criterion will be a useful supplementary criterion to $\chi^2$, but because it takes account solely of the sign of a deviation and not of its magnitude it will probably only be useful when used in conjunction with $\chi^2$. A test of significance which could combine both the probability levels of $T$ and $\chi^2$ would undoubtedly be more useful, and we may therefore consider how this might be done. Unless the exact degree of dependence which exists between two variables is known it is usually only possible to obtain their joint distribution if they are independent. It would appear reasonable, both on theoretical grounds and from sampling experiments, to assume that $T$ and $\chi^2$ are independent, or, if the assumptions underlying both tests are not exactly fulfilled, to assume that the degree of dependence between them is at most small.


Table 2. Comparison of theoretical distribution of T with
that derived from a sampling experiment

(4 positive, 4 negative)
T = number of sets      2      3      4      5      6      7      8   Total
Sampling                3      5     20     25     28      5      7      93
Theory                2.7    8.0   23.9   23.9   23.9    8.0    2.7      93

(5 positive, 3 negative) or (3 positive, 5 negative)
T = number of sets      2      3      4      5      6      7   Total
Sampling                2      8     32     30     20     10     102
Theory                3.6   10.9   29.1   29.1   21.9    7.3     102

(6 positive, 2 negative) or (2 positive, 6 negative)
T = number of sets      2      3      4      5   Total
Sampling                1      3      –      5       9
Theory                0.6    1.9    3.2    3.2       9

(7 positive, 1 negative) or (1 positive, 7 negative)
T = number of sets      2      3   Total
Sampling                –      2       2
Theory                0.5    1.5       2














11. We shall begin by demonstrating that as far as mathematics are concerned the $T$ and $\chi^2$ criteria are completely independent.* For simplicity of argument let us consider the case of three groups only. The sample may then be represented by a point $(n_1, n_2, n_3)$ in three-dimensioned space, with axes of reference $On_1$, $On_2$, $On_3$, and the expected population values by a point $(m_1, m_2, m_3)$ in the same space. Since

$$n_1 + n_2 + n_3 = m_1 + m_2 + m_3 = N,$$

* This method of approach was suggested to me by Andrew Gleason of Harvard University at a seminar given at the Statistical Laboratory, University of California, at Berkeley.










these points are constrained to lie in a plane. Fig. 1 shows this plane for the particular case $N = 16$; $m_1 = 4$, $m_2 = 8$, $m_3 = 4$. Since no frequency can be negative, possible sample points must be within an equilateral triangle lying in this plane, the chance of occurrence associated with a point being the multinomial term

$$\frac{N!}{n_1!\,n_2!\,n_3!}\left(\frac{m_1}{N}\right)^{n_1}\left(\frac{m_2}{N}\right)^{n_2}\left(\frac{m_3}{N}\right)^{n_3}.$$

When using the $\chi^2$ test the mathematical approximation consists in substituting for this term an expression proportional to $e^{-\frac{1}{2}\chi^2}$, in regarding this last as a continuous function, and in taking as a measure of goodness of fit the integral of this expression outside the ellipse which passes through the sample point and on which $\chi^2$ is constant. For the case of three groups this integral itself assumes the simple form $e^{-\frac{1}{2}\chi^2}$. Three such elliptic contours are shown in the diagram.

Fig. 1. Graphical illustration of the $\chi^2$ contours and the change in signs of the $\delta n$'s. $N_1$, $N_2$ and $N_3$ denote the points of intersection of the $On_1$, $On_2$, $On_3$ axes with the plane $n_1 + n_2 + n_3 = N$. According to the approximation, the chance equals $\varepsilon$ of obtaining a sample point lying outside the elliptic contour on which $\chi^2 = \chi^2_\varepsilon$.

Planes through $(m_1, m_2, m_3)$ parallel to the co-ordinate planes $N_1ON_2$, $N_2ON_3$, $N_3ON_1$ will intersect the sample plane

$$n_1 + n_2 + n_3 = N$$

in three straight lines. As shown in the diagram, these lines divide the sample plane into six sectors, and for all sample points within a sector the signs of the differences $\delta n_i = n_i - m_i$ will remain unchanged. Any test based solely on runs of signs will consist in taking one or more of these sectors as critical regions and rejecting the hypothesis tested when the sample point falls therein. It is clear that if we use the mathematical approximation, the distribution of $\chi^2$ is the same within each sector; similarly, that the chance of a sample showing a given combination of signs is the same on each ellipse along which $\chi^2$ is constant. Thus under the assumptions made regarding the distribution of $\chi^2$, the $T$ and $\chi^2$ criteria are completely independent.

In this case of three groups $T$ can only assume values of two or three and the former value would not be judged significant, but the argument will follow exactly similar lines in the case of many groups. The number of sectors will be in general $2(2r_2 + 1)$ if $r_1 > r_2$ and $2(2r_2)$ if $r_1 = r_2$, and they will be bounded by primes passing through the population point.

12. While the distributions of $T$ and $\chi^2$ are independent for this mathematical model they are unlikely to be exactly so when we go back to the true multinomial density distribution, because the sample space is neither continuous nor infinite. The model, in fact, becomes inaccurate if $m_1$, $m_2$ or $m_3$ are very small. For example, it is seen in Fig. 1 that while the 1% ellipse ($\chi^2 = \chi^2_{0.01}$) lies completely within the triangular space for the sectors with signs + − + and − − +, it lies completely without the space for the sector + + − and partly without for the other sectors. It has been thought worth while therefore to test whether the two criteria are independent in practice, and to this end the same material previously described has been utilized. Tables 3 and 4 give the distribution of mean $\chi^2$ for different values of $P\{T\}$ and the distribution of mean $P\{T\}$ for grouped values of $\chi^2$. There is little evidence in these figures to show that $P\{T\}$ and $\chi^2$ (and therefore $P\{\chi^2\}$) are related. The figures therefore lend support to the geometrical argument and indicate that the approximations involved in $\chi^2$, both from the small sample and the fact that the sample space is not infinite, do not invalidate the mathematical result.

13. In order to combine the $\chi^2$ and $T$ tests of significance it will be necessary to develop a theory for the combination of two tests of significance when one criterion is a continuous and the other a discontinuous variable. R. A. Fisher has set out the test for the combination of tests of significance from a number of independent continuous variables. The keystone of the test is the recognition of the fact that if $Z$ is a continuous variable, then $z$, where


$$z = \int_{-\infty}^{Z} p(Z)\,dZ,$$


is also a continuous variable equally likely to have any value between 0 and 1; we shall describe $z$ as being distributed rectangularly. Minus twice the natural logarithm of the product of two such $z$'s, say $z_1$ and $z_2$, where $z_1$ and $z_2$ follow from two independent tests of significance, can be shown to be distributed as $\chi^2$ with four degrees of freedom. Consider a discontinuous variable $X$ which may take values $X_1, X_2, \dots, X_s$ and which has an elementary probability law

$$P\{X = X_j\} = p_j, \quad\text{where } 0 < p_j < 1 \text{ for } j = 1, 2, \dots, s \quad\text{and}\quad \sum_{j=1}^{s} p_j = 1.$$

If a new variable, $x$, is defined as taking values $x_1, x_2, \dots, x_s$, where

$$x_k = \sum_{j=1}^{k} p_j,$$






































Table 3. Mean χ² for different values of P{T}

r1, r2 or r2, r1                        –    4,4    5,3    4,4    5,3    6,2    4,4    5,3    4,4    6,2
No. of obs. on which mean is based     26      5     20     28     30      –     25     32     20      3
P{T}                                 1.00   0.97   0.93   0.86   0.71   0.64   0.63   0.43   0.37   0.29
Mean χ²                              7.19   6.04   5.87   6.43   5.98      –   7.41   7.32   6.82   4.86

r1, r2 or r2, r1                      7,1    5,3    4,4    6,2    5,3 and 4,4
No. of obs. on which mean is based      –      8      5      1    5
P{T}                                 0.25   0.14   0.11   0.07    0.04 and 0.03
Mean χ²                                 –   6.10   5.53   6.95    8.06














Table 4. Mean P{T} for grouped χ²

χ²
0.0-1.0
1.0-2.0  2.0-3.0  3.0-4.0  4.0-5.0
5.0-6.0  6.0-7.0  7.0-8.0  8.0-9.0  9.0-10.0

No. of obs. on which mean is based
Mean P{T}

14
0.71  0.73  0.69
27  32  29
0.60  0.68  0.60
21  14  11  15
0.65  0.57  0.67  0.64

χ²
10.0-11.0  11.0-12.0
12.0-13.0  13.0-14.0  14.0-15.0
15.0-16.0  16.0-17.0  17.0-18.0

No. of obs. on which mean is based
Mean P{T}

10  10
0.74  0.65
5
0.74
4  2
0.58  0.82








then $x_k$ may only take values between 0 and 1 for $k = 1, 2, \dots, s$. It is required to find the joint probability law of the product of two independent variables $x$ and $z$, where $x$ and $z$ are as defined above. It will be noted that the elementary probability law of $x$ will be

$$P\{x = x_j\} = p_j \quad (j = 1, 2, \dots, s).$$

Hence when $x = x_j$ (the probability of which is $p_j$), the product $xz$ will be distributed rectangularly between 0 and $x_j$ on a proportion $p_j$ of occasions. It follows that $xz$ has a probability distribution which has points of discontinuity at $x_1, x_2, \dots, x_s$, that it is distributed rectangularly between these points of discontinuity, and that

$$P\{0 < xz < x_1\} = p_1 \sum_{j=1}^{s} \frac{p_j}{x_j}, \qquad P\{x_1 < xz < x_2\} = p_2 \sum_{j=2}^{s} \frac{p_j}{x_j}.$$

Generally

$$P\{x_{k-1} < xz < x_k\} = p_k \sum_{j=k}^{s} \frac{p_j}{x_j}.$$
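
The piecewise-rectangular law just derived can be sketched numerically. The sketch below accumulates $P\{xz \le y\}$ directly from the statement that, given $x = x_j$, the product is rectangular on $(0, x_j)$, and then inverts it segment by segment to obtain a level $Y_\varepsilon$. The names `product_cdf` and `critical_value` are illustrative assumptions, and the list `p` is assumed to contain strictly positive probabilities summing to one.

```python
def product_cdf(y, p):
    """P{xz <= y}, where x = x_j = p_1 + ... + p_j with probability p_j
    and z is rectangular on (0, 1), independent of x."""
    total, cum = 0.0, 0.0
    for pj in p:
        cum += pj                        # x_j
        total += pj * min(1.0, y / cum)  # given x = x_j, xz is rectangular on (0, x_j)
    return total


def critical_value(eps, p):
    """Solve P{xz <= Y} = eps for Y, working through the segments (x_{k-1}, x_k)."""
    xs, cum = [], 0.0
    for pj in p:
        cum += pj
        xs.append(cum)
    below = 0.0  # x_{k-1}, the probability already accumulated to the left
    for k, xk in enumerate(xs):
        # On this segment the integral is x_{k-1} + y * sum_{j >= k} p_j / x_j.
        slope = sum(pj / xj for pj, xj in zip(p[k:], xs[k:]))
        y = (eps - below) / slope
        if y <= xk:
            return y
        below = xk
    return 1.0


# Four positive and one negative sign: T = 2 or 3 with probabilities 2/5 and 3/5.
print(critical_value(0.05, [0.4, 0.6]))  # close to 0.03125
print(critical_value(0.01, [0.4, 0.6]))  # close to 0.00625
```

For this split the two levels agree with the first row of Table 5, which suggests the sketch follows the same construction as the table.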











14. If we now apply this theory to the combination of the tests of significance of $T$ and $\chi^2$, it is seen that we must consider the product of $P\{\chi^2\}$ and $P\{T\}$. $\chi^2$ is a continuous variable and

$$z = \int_{\chi_0^2}^{+\infty} p(\chi^2)\,d(\chi^2) = P\{\chi^2 > \chi_0^2\} = P\{\chi^2\}$$

is distributed rectangularly between 0 and 1, and

$$x = \sum_{T=2}^{T_0} P\{T \mid r_1, r_2\} = P\{T \le T_0\} = P\{T\}$$

is a discontinuous variable taking known values. The probability integral of $xz$ is thus known from theory and $Y_{0.05}$ or $Y_{0.01}$ can be found to satisfy the relation

$$P\{0 < xz < Y_\varepsilon\} = \varepsilon.$$

These probability levels are given in Table 5. The procedure for the joint test of significance will be:

(i) calculate $P\{T\}$ as described in § 7;

(ii) calculate $P\{\chi^2\}$ in the usual way. The degrees of freedom will be the number of groups minus one;

(iii) multiply $P\{T\}$ and $P\{\chi^2\}$ together and refer to Table 5 to judge the significance of the product.


Table 5. Values of Y0.05 and Y0.01, where P{P(χ²) P(T) < Yε} = ε

This table may be used to judge the significance of the joint distribution
of the T criterion and any other continuous criterion.

 r   r1   r2    Y0.05      Y0.01
 5    4    1    0.03125    0.00625
      3    2    0.0213     0.0043
 6    5    1    0.0300     0.0060
      4    2    0.0211     0.0042
      3    3    0.0195     0.0039
 7    6    1    0.0292     0.0058
      5    2    0.0197     0.00395
      4    3    0.0174     0.0035
 8    7    1    0.0286     0.0057
      6    2    0.0188     0.0038
      5    3    0.0160     0.0032
      4    4    0.0153     0.0031
 9    8    1    0.0281     0.0056
      7    2    0.0180     0.0036
      6    3    0.0153     0.0031
      5    4    0.0140     0.0028
10    9    1    0.0278     0.0056
      8    2    0.0175     0.0035
      7    3    0.0143     0.0029
      6    4    0.0143     0.0026
      5    5    0.0143     0.0025
11   10    1    0.0275+    0.0055+
      9    2    0.0171     0.0034
      8    3    0.0144     0.0028
      7    4    0.0144     0.0025+
      6    5    0.0140     0.0024
12   11    1    0.0273     0.0055−
     10    2    0.0174     0.0035−
      9    3    0.0149     0.0027
      8    4    0.0142     0.0024
      7    5    0.0135     0.0022
      6    6    0.0131     0.0021
13   12    1    0.0271     0.0054
     11    2    0.0165+    0.0033
     10    3    0.0151     0.0026
      9    4    0.0138     0.0023
      8    5    0.0137     0.0022
      7    6    0.0138     0.00225
14   13    1    0.0269     0.0054
     12    2    0.0163     0.0033
     11    3    0.0151     0.0025+
     10    4    0.0135−    0.00225
      9    5    0.0138     0.0023
      8    6    0.0136     0.0022
      7    7    0.0134     0.0022




















































15. The application of the joint test of significance may be illustrated by means of an example. A sample of 360 observations is available. This sample has actually been randomly drawn from a normal population of which the mean is zero and the standard deviation unity. The figures are given in Table 6. Calculations give $\chi^2 = 21.1$ and $P\{\chi^2\} = 0.10$. Judging by $\chi^2$ alone we should probably say that there is nothing out of the ordinary in the deviations of the sample from the expected values. The number of signs is 15, of which 9 are positive and 6 negative, and these are arranged in six sets. Making the appropriate calculations, we have

$$P\{6 \text{ sets} \mid 9 \text{ positive}; 6 \text{ negative}\} = \frac{875}{5005} = 0.175.$$

The arrangement of signs will therefore be judged as acceptable. The joint significance of a $P\{\chi^2\} = 0.10$ and a $P\{T\} = 0.175$ is found, by evaluating the joint distribution, to be 0.066.
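
The figures of this example can be retraced with the following sketch, which takes the observed and expected frequencies from Table 6 and the quoted value $P\{\chi^2\} = 0.10$ as given. It assumes the usual counts of arrangements into sets and the piecewise-rectangular law of § 13; the helper `arrangements` and the variable names are illustrative, not the author's.

```python
from math import comb

observed = [12, 10, 18, 26, 23, 42, 43, 49, 35, 28, 20, 26, 20, 3, 5]
expected = [9.3, 8.6, 14.0, 21.0, 28.7, 35.9, 41.0, 43.0,
            41.0, 35.9, 28.7, 21.0, 14.0, 8.6, 9.3]
p_chi2 = 0.10  # P{chi^2} as quoted in the text; it is not recomputed here

signs = [o > e for o, e in zip(observed, expected)]
r1 = sum(signs)                 # number of positive deviations
r2 = len(signs) - r1            # number of negative deviations
t0 = 1 + sum(a != b for a, b in zip(signs, signs[1:]))  # observed number of sets


def arrangements(t):
    """Number of orderings of r1 '+' and r2 '-' signs that form exactly t sets."""
    k, odd = divmod(t, 2)
    if odd:
        return (comb(r1 - 1, k) * comb(r2 - 1, k - 1)
                + comb(r1 - 1, k - 1) * comb(r2 - 1, k))
    return 2 * comb(r1 - 1, k - 1) * comb(r2 - 1, k - 1)


total = comb(r1 + r2, r1)
# Elementary probabilities for T = 2, 3, ...; zero counts occur only at the upper end.
p = [arrangements(t) / total for t in range(2, r1 + r2 + 1) if arrangements(t)]
p_t = sum(p[:t0 - 1])           # P{T <= t0}

# Probability that the product of the two levels is no larger than the one observed.
y = p_chi2 * p_t
joint, cum = 0.0, 0.0
for pj in p:
    cum += pj
    joint += pj * min(1.0, y / cum)

print(r1, r2, t0)               # 9 6 6
print(round(p_t, 3))            # 0.175
print(round(joint, 3))          # 0.066
```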


Table 6. Sample values. Observed and expected

Central values    −2.1 and under    −1.8     −1.5     −1.2     −0.9     −0.6     −0.3      0.0
Observation             12            10       18       26       23       42       43       49
Expectation            9.3           8.6    14.0+    21.0+     28.7     35.9    41.0+    43.0−
Deviation             +2.7          +1.4     +4.0     +5.0     −5.7     +6.1     +2.0     +6.0

Central values     +0.3     +0.6     +0.9     +1.2     +1.5     +1.8    +2.1 and over    Total
Observation          35       28       20       26       20        3          5           360
Expectation       41.0+     35.9     28.7     21.0     14.0      8.6        9.3           360
Deviation          −6.0     −7.9     −8.7     +5.0     +6.0     −5.6       −4.3             0



































16. A study of the basic table (Table 1) of the function $T$ will show that $P\{T\}$ is not a very sensitive criterion with which to judge the randomness of a sequence of signs unless the number of groups under consideration is very large. For example, if there are 10 signs, 5 of which are positive and 5 of which are negative, the probability of getting two sets of signs is 0.008. Thus the test would show, and rightly, that the chance of such an arrangement is small, but this fact would undoubtedly be recognized by a skilled computer without the use of a test at all. In the case of 10 signs the probability of three groups or less is 0.040, and this would possibly be judged non-significant. Again, let us consider an extreme case, say 10 signs, 9 of which are positive and 1 negative. The $T$ criterion does not concern itself with the fact that the numbers 9 and 1 are exceptional, it is merely concerned with deciding whether their arrangement is exceptional given the 9 and 1. Table 1 shows that neither possible arrangement would be considered out of the ordinary. It is these points of weakness which show that the criterion $T$ is not of great utility except in combination with $\chi^2$. For, if we consider the 9 positive, 1 negative case, common sense tells us that the $\chi^2$ criterion in such a case would possibly be significant. Nine positive deviations have to be balanced by a single negative deviation, and this last is therefore likely to be big. This does not influence $T$; neither will the contribution of $T$ to the joint criterion be of much weight. This is as it should be, for it is difficult to see how one can postulate a smooth alternative for 9 positive, 1 negative, two sets, and not also for 9 positive, 1 negative, three sets. Generally, however, we shall not meet such extreme cases in practice. One way of overcoming this weakness of the test would be to consider the probability of obtaining $r_1$ positive and $r_2$ negative signs together with the probability of obtaining $T$ sets of alternate positive and negative signs given $r_1$ and $r_2$. This is simple enough when considering just a sequence of alternatives, as I have shown elsewhere, but it is not easy to fit these results to the $\chi^2$ problem, nor, when this is possible, will the choice of a critical region be straightforward. However, the results of sampling experiments will be utilized to throw light on these points and it is hoped to discuss them, with other questions arising, in a further publication.

17. It is possible that there are other criteria, depending on the arrangement of positive and negative signs, which will be more sensitive than the $T$ criterion chosen. For example, it is easy to calculate, given $r_1$ and $r_2$, the probability that the largest set is composed of a sequence of $r'$ positive signs, and there are many other possibilities which might be considered. It would appear that any criterion based on sign sequences can be shown to be independent of $\chi^2$ by means of geometrical argument, and it will be necessary therefore to consider the power of these different sign tests when referred to a specified set of alternate hypotheses.

18. The main objection to the two criteria, $T$ and $P\{\chi^2\}\,P\{T\}$, that I have proposed in this note is the one which was mentioned earlier; they are only applicable to the case where there is just one restriction on $\chi^2$, i.e. when the totals of expected and observed frequencies have been made to agree. It is possible to work out a slightly different form of the $T$ criterion for each additional restriction which is put on $\chi^2$, and this has been done. It is preferable, however, to delay publication until the results of an extensive sampling experiment are complete in order to verify whether such theoretical assumptions as have been made are reasonable.

REFERENCES

NEYMAN, J. (1937). Skand. Aktuar. Tidskr. 20, 149-99.
NEYMAN, J. & PEARSON, E. S. (1928). Biometrika, 20 A, 263-94.
















AN EXACT TEST FOR THE EQUALITY OF VARIANCES* 
By R. L. PLACKETT, M.A. 


INTRODUCTION 


The problem of testing the equality of variances and covariances in normal distributions is one which has received considerable attention; we have compiled a bibliography of some sixty papers, and shall issue a survey of these in due course; only papers vital to our discussion will be considered here. A precise instance of the type of situation we are considering is as follows: measurements of height, span and tibia length are made on each of 20 Englishmen, 20 Scotsmen, 20 Welshmen and 20 Irishmen; it is required to know if the covariance matrix of the three characteristics is the same for each of the four nationalities. Nothing is known or assumed about the mean values of these characteristics in the four populations considered, nor are we interested in testing any hypothesis concerning the means, although such a hypothesis may be the object of further investigations which assume that the four covariance matrices are the same; this latter assumption is made in multivariate analysis of variance.

Wilks (1932) has already given the moments of the distribution of his criterion for testing the equality of several covariance matrices (on the hypothesis that the matrices are in fact equal) and Bishop (1939) put this criterion into an approximate workable shape. The test criterion given here differs from that of Wilks and has the advantage when one or two correlated characteristics are being measured (height or height and span, for example) that its distribution is exactly known whatever the number of populations. Nair (1939) did, it is true, give the exact distribution of the Neyman & Pearson (1931) $L_1$ criterion for one measured characteristic; and the exact distribution for two characteristics of Wilks's generalization of their criterion; but the form in which the distribution was obtained is very involved. It is interesting to notice that from our standpoint the problem of testing the equality of several variances (i.e. the case of one measured characteristic) is, as will appear, brought within the framework of multiple correlation theory. In the general case of more than two characteristics the moments of the distribution of our criterion, like those of Wilks, are available.


OUTLINE OF METHOD 


In the usual terminology we consider $k$ $p$-variate normal distributions and are concerned with testing the hypothesis that the corresponding variances and covariances are all equal. The method we employ to test this hypothesis is essentially that which has been in use in analysis of variance since its origination by Fisher; to test the equality of a set of $k$ quantities we test whether $(k-1)$ orthogonal linear functions of the quantities are each zero. To illustrate the application of this principle in the present instance take the particular case $p = 1$, i.e. we wish to test the equality of the variances in $k$ univariate normal distributions. If a typical observation from the $l$th distribution is $x_l$ ($l = 1, 2, \dots, k$), form $k$ mutually orthogonal linear functions of the $x_l$ such that one is

$$u = x_1 + x_2 + \dots + x_k.$$


* Communication from the National Physical Laboratory. 




If the $(k-1)$ covariances of $u$ and each of the other linear functions are all zero then the variances of the $k$ distributions must all be equal; this condition may be expressed by saying that the multiple correlation coefficient of $u$ on the other linear functions is zero. Further, if there are $n$ sample values of $u$ then the size of sample drawn from each distribution must also be $n$ at least, and if no observations are to be discarded the size of each sample must be $n$ exactly. Thus, although it is not a condition of the problem that the sizes of samples drawn from the $k$ distributions must all be equal, it is a condition of our solution.

The extension of the foregoing principle to $p > 1$ is straightforward and is considered in detail in the next section; the problem then becomes that of testing the independence of two groups of variates, the first of size $p$, i.e. $p$ expressions of the form $u$, and the second of size $p(k-1)$ comprising all the other orthogonal linear functions. This problem has been treated by Wilks (1935, 1943) and the relevant distribution is expressible as an incomplete B-function when $p = 1$ and 2 (for all $k$); an exact distribution is also known when $p = 3$ and 4 for $k = 2$. Finally, since when $p = 1$ the criterion has the form of a multiple correlation coefficient, the power of the test in this instance can be calculated by virtue of the work of Fisher (1928).


DISCUSSION OF THE TEST


A sample of $n$ observations is drawn from each of the $k$ $p$-variate normal distributions of which the $l$th has the covariance matrix $V^l_{ij}$ ($l = 1, 2, \dots, k$; $i, j = 1, 2, \dots, p$). It is required to test the hypothesis that

$$V^l_{ij} = V^m_{ij} \quad (l, m = 1, 2, \dots, k). \tag{1}$$

The population means do not enter into the hypothesis and have arbitrary unknown values. Where $i$, $j$, $l$, $m$ appear henceforth they will be understood to range over the values given above unless otherwise stated. The observations may be written in the form of an $n \times kp$ matrix $X$ such that all those on the $i$th variate in the $l$th distribution are in column $(i-1)k + l$. The $\alpha$th observation in this column ($\alpha = 1, 2, \dots, n$) is denoted by $x^l_{i\alpha}$; the order of the elements in a column is assumed to be random. If this is doubted the observations should be randomly rearranged.

We must emphasize here that the sample value of the criterion to be used to test (1) depends on this order, and there is thus, in a sense, a correspondence between $x^l_{i\alpha}$ and $x^m_{j\alpha}$, although these two quantities are, of course, uncorrelated when $l \ne m$. Most tests of a hypothesis specifying nothing about the order in which observations are made or written down are themselves independent of it; ours is not, and different computers with the same data might well come to different conclusions although this does not affect the validity of the test, the significance level being overall what it should be. There is probably some loss of power which can, however, be offset by imbuing $\alpha$ with a certain physical meaning; but we shall not discuss this question here. A criterion for testing normality depending on the order of arrangement of observations has been suggested by R. C. Geary (1935, pp. 316-17).


Let now

$$z^l_{i\alpha} = x^l_{i\alpha} - \Big(\sum_{\alpha} x^l_{i\alpha}\Big)\Big/ n, \tag{2}$$

and let the corresponding $n \times kp$ matrix be $Z$. If $G = Z'Z$, where a prime is used to denote the transpose of a matrix, then, apart from a factor $n$, $G$ is the matrix of sample variances and covariances of all variables. We further define $S(k, p)$ as the sum of all ($k^{2p}$) signed minors formed by rows $l_1, l_2, \dots, l_p$ and columns $m_1, m_2, \dots, m_p$ of $G$, where

$$(i-1)k < l_i,\ m_i \le ik. \tag{3}$$

$\bar{S}(k, p)$ is similarly defined for the matrix $\bar{G} = G^{-1}$ (we shall use this notation for the inverses of matrices throughout).

We now proceed to prove the following

THEOREM:

$$W(k, p) = k^{2p}/S(k, p)\,\bar{S}(k, p)$$

is distributed like Wilks's statistic for testing the hypothesis that two groups of variates of sizes $p$ and $p(k-1)$, known to have been drawn from a ($kp$)-variate normal distribution, are mutually independent (Wilks, 1935, 1943). If the groups are in fact mutually independent then (1) is true.


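Proof. Introduce a $k \times k$ orthogonal matrix $B$, the elements of whose first column are all equal (to $\pm 1/\sqrt{k}$) but which is otherwise quite arbitrary. Put

$$r = (i-1)k + l, \qquad u = (m-1)p + j, \tag{4}$$

and form a $kp \times kp$ matrix $A$ such that

$$a_{ru} = \delta_{ij}\, b_{lm}, \tag{5}$$

where $\delta_{ij} = 1$ ($i = j$), otherwise 0. Clearly $A$ is also orthogonal. For example, suppose $k = 4$, $p = 2$. Apart from a factor of $\pm\frac{1}{2}$ multiplying each element, let

$$B = \begin{pmatrix} 1 & 1 & 1 & 1 \\ 1 & 1 & -1 & -1 \\ 1 & -1 & 1 & -1 \\ 1 & -1 & -1 & 1 \end{pmatrix}.$$

Then

$$A = \begin{pmatrix}
1 & 0 & 1 & 0 & 1 & 0 & 1 & 0 \\
1 & 0 & 1 & 0 & -1 & 0 & -1 & 0 \\
1 & 0 & -1 & 0 & 1 & 0 & -1 & 0 \\
1 & 0 & -1 & 0 & -1 & 0 & 1 & 0 \\
0 & 1 & 0 & 1 & 0 & 1 & 0 & 1 \\
0 & 1 & 0 & 1 & 0 & -1 & 0 & -1 \\
0 & 1 & 0 & -1 & 0 & 1 & 0 & -1 \\
0 & 1 & 0 & -1 & 0 & -1 & 0 & 1
\end{pmatrix}.$$

When $p = 1$, $A = B$. Let

$$D = XA, \qquad Y = ZA, \qquad C = Y'Y = A'GA. \tag{6}$$

Putting

$$s = (j-1)k + m, \qquad t = (l-1)p + i, \tag{7}$$

and defining

$$t' = (l-1)p + i, \qquad u' = (m-1)p + j \qquad (l, m = 2, 3, \dots, k), \tag{8}$$

we have

$$E(g_{rs}) = \delta_{lm}(n-1)V^l_{ij}, \tag{9}$$

so that

$$E(c_{iu'}) = (n-1)\sum_{l} b_{lm} V^l_{ij}\Big/\sqrt{k}. \tag{10}$$

Hence when

$$E(c_{iu'}) = 0 \tag{11}$$

equations (1) are satisfied, because for fixed $i$ and $j$ equations (10) can be solved and yield

$$(n-1)V^l_{ij} = E(c_{ij}). \tag{12}$$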







Denote a typical element of the $t$th column of $D$ by $d_t$. Then equations (11) are satisfied if and only if $d_i$ and $d_{u'}$ are mutually independent.

A criterion for testing (11), obtained by likelihood-ratio methods, has been given by Wilks (1935, 1943). This is

$$W(k, p) = \frac{|c_{tu}|}{|c_{ij}|\,|c_{t'u'}|}, \tag{13}$$

and is sometimes called the vector alienation coefficient. Let $C^{(p)}$ be the $p$th compound of $C$ (Aitken, 1939, p. 90), i.e. the matrix of all $p \times p$ minors of $C$; and $\bar{C}^{(p)}$ the $p$th compound of $\bar{C} = C^{-1}$ (since $\bar{C}^{(p)}$ is the inverse of $C^{(p)}$ our notation is consistent). Then

$$W(k, p) = 1\big/c^{(p)}_{11}\,\bar{c}^{(p)}_{11} \tag{14}$$

by an application of Jacobi's theorem on the minors of the adjugate (Aitken, 1939, p. 97). Now by the Binet-Cauchy theorem (Aitken, 1939, p. 93),

$$C^{(p)} = (A')^{(p)}\, G^{(p)} A^{(p)}, \qquad \bar{C}^{(p)} = (A')^{(p)}\, \bar{G}^{(p)} A^{(p)}. \tag{15}$$

Consider the elements in the first row of $(A')^{(p)}$. The first $p$ rows of $A'$ are of the form

11...1  00...0  00...0  ...  00...0
00...0  11...1  00...0  ...  00...0
00...0  00...0  11...1  ...  00...0
 ...
00...0  00...0  00...0  ...  11...1

apart from the factor $\pm 1/\sqrt{k}$ multiplying each element. Therefore the only non-zero elements in the first row of $(A')^{(p)}$ are those formed by taking one column from each of the $p$ blocks of $k$ columns into which the first $p$ rows of $A'$ may be divided. All the non-zero elements equal $k^{-\frac{1}{2}p}$. Then from (15)

$$c^{(p)}_{11} = S(k, p)\,k^{-p}, \qquad \bar{c}^{(p)}_{11} = \bar{S}(k, p)\,k^{-p}, \tag{16}$$

so finally

$$W(k, p) = k^{2p}/S(k, p)\,\bar{S}(k, p). \tag{17}$$

This completes the proof.
Case of p = 1

Here

$$W(k, 1) = k^2/S(k, 1)\,\bar{S}(k, 1),$$

where $S(k, 1)$, $\bar{S}(k, 1)$ are the sums of all elements of $G$, $G^{-1}$ respectively. If (1) is true, $\mathcal{W}(k, 1)$, the true value of $W(k, 1)$, is unity. Define

$$W(k, 1) = 1 - R^2 \quad\text{and}\quad \mathcal{W}(k, 1) = 1 - \mathcal{R}^2, \tag{18}$$

so that if (1) is true, $\mathcal{R} = 0$. The distribution of $R^2 = 1 - W(k, 1)$ when $\mathcal{R} = 0$ is, as Wilks pointed out, well known, being that of the multiple correlation coefficient (of $d_1$ on $d_2, d_3, \dots, d_k$); if in the usual notation

$$I_x(a, b) = [B(a, b)]^{-1}\int_0^x x^{a-1}(1-x)^{b-1}\,dx, \tag{19}$$

then the cumulative distribution function of $x = R^2$ is $I_x\big(\tfrac{1}{2}(k-1), \tfrac{1}{2}(n-k)\big)$, values near 1 being significant; that of $x = W(k, 1)$ being $I_x\big(\tfrac{1}{2}(n-k), \tfrac{1}{2}(k-1)\big)$ with small values significant. Tables in convenient form have been calculated by Thompson (1941); otherwise we can convert to the variance-ratio $F$ by

$$F = (n-k)(1-W)/(k-1)W. \tag{20}$$










It is clear that n must exceed k; for p variates, n exceeds pk in order that G may be non- 
singular. 
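
For $p = 1$ the computation is light enough to sketch in a few lines. The sketch below forms $G = Z'Z$ from $k$ samples of equal size $n$, sums the elements of $G$ and of $G^{-1}$, and converts $W(k, 1)$ to the variance ratio of (20). The function names `invert` and `w_statistic` and the data are hypothetical, chosen only for illustration; the observations are assumed to be paired by their position in each sample, as the text requires.

```python
def invert(m):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(m)
    a = [[float(v) for v in row] + [float(i == j) for j in range(n)]
         for i, row in enumerate(m)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(a[r][c]))
        if a[piv][c] == 0.0:
            raise ValueError("matrix is singular")
        a[c], a[piv] = a[piv], a[c]
        pv = a[c][c]
        a[c] = [v / pv for v in a[c]]
        for r in range(n):
            if r != c and a[r][c] != 0.0:
                f = a[r][c]
                a[r] = [vr - f * vc for vr, vc in zip(a[r], a[c])]
    return [row[n:] for row in a]


def w_statistic(samples):
    """Return (W(k,1), F) for k samples of n observations each (n > k)."""
    k, n = len(samples), len(samples[0])
    # Deviations from the sample means, as in equation (2).
    z = [[x - sum(col) / n for x in col] for col in samples]
    # G = Z'Z: sums of squares and products of the deviations.
    g = [[sum(a * b for a, b in zip(zi, zj)) for zj in z] for zi in z]
    s = sum(map(sum, g))               # S(k, 1): sum of all elements of G
    s_bar = sum(map(sum, invert(g)))   # sum of all elements of the inverse of G
    w = k * k / (s * s_bar)
    f = (n - k) * (1 - w) / ((k - 1) * w)   # equation (20)
    return w, f


# Hypothetical data: two samples of five observations each.
w, f = w_statistic([[1.0, 2.0, 4.0, 3.0, 5.0], [2.0, 1.0, 3.0, 5.0, 9.0]])
print(w, f)
```

With two samples the value of `w` can be checked by hand against the closed form $4[g_{11}g_{22} - (g_{12})^2]/[(g_{11}+g_{22})^2 - (2g_{12})^2]$; here $g_{11} = 10$, $g_{22} = 40$ and $g_{12} = 16$.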

If the matrix $A$ is defined instead as a $kp \times kp$ orthogonal matrix, the elements of whose first column are all equal (cf. equation (5)), the problem is effectively reduced to the case $p = 1$ whatever the value of $p$, and we can test exactly the somewhat indefinite hypotheses

$$V^l_{i1} + V^l_{i2} + \dots + V^l_{ip} = V^m_{j1} + V^m_{j2} + \dots + V^m_{jp}. \tag{21}$$

This may be applied in the following manner, for take $k = p = 2$ and obtain

$$V^1_{11} + V^1_{12} = V^1_{21} + V^1_{22} = V^2_{11} + V^2_{12} = V^2_{21} + V^2_{22}. \tag{22}$$

Thus

$$V^1_{11} = V^1_{22} \quad\text{and}\quad V^2_{11} = V^2_{22}. \tag{23}$$

If it is assumed

$$V^1_{12} = V^2_{12} \tag{24}$$

then

$$V^1_{11} = V^2_{11} \tag{25}$$

and conversely.

Case of p = 2

The distribution of $W(k, 2)$ has been given by Wilks (1935). If $x = \sqrt{W(k, 2)}$, the cumulative distribution function of $x$ is

$$I_x(n-2k,\ 2k-2). \tag{26}$$

Small values of $x$ are significant and $n$ must exceed $2k$.


Case of p ≥ 3

For $k = 2$, $p = 3$ and 4, the exact distributions are again known and have been given by Wilks in equations (35) and (37) respectively of his 1935 paper. The expressions are rather complicated and we have not reproduced them here. For other values of $k$ and $p$ the moments of $W(k, p)$ are available; while more recently Wald & Brookner (1941) have obtained the distribution in the form of an infinite series, calculating numerical values for the coefficients in certain instances.

For $p > 1$, (17) becomes rather intractable as a means of calculating $W(k, p)$. Indeed, for $k = 2$ and $p = 4$ it is necessary

(i) to calculate 36 sample variances and covariances,
(ii) find the inverse of an $8 \times 8$ matrix,
(iii) calculate 512 $4 \times 4$ determinants,

and it is clearly better to reintroduce the matrix $A$ in some appropriate numerical form, calculate $Y = ZA$ and $C = Y'Y$, and find $W(2, 4)$ from (13), a process which involves the evaluation of an $8 \times 8$ and two $4 \times 4$ determinants.


POWER OF THE TEST WHEN p=1 
From (17) the true value of R? is in general given by 


rnemae (Eri) (Bure). mn 


and thus the test will have equal power for all values of the variances such that the product 
of their sum and the sum of their reciprocals is constant. Consequently 1 — W(k, 1) is dis- 
tributed like the multiple-correlation coefficient in samples from a population where the 








316 An exact test for the equality of variances 


true value is given by (27). The probability density function of this distribution has been 
deduced by Fisher (1928) and can be integrated to give a finite series when (n — k) is even. 
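
As a numerical illustration of equation (27), the following Python sketch evaluates the true value of R² for a given set of population variances. The function name `true_r_squared` is introduced here for illustration only; the standard deviations 10, 30, 10, 40 are those of the worked example given later in the paper.

```python
def true_r_squared(variances):
    """True squared multiple correlation from equation (27):
    1 - R^2 = k^2 / (sum of variances * sum of reciprocal variances)."""
    k = len(variances)
    total = sum(variances)
    total_reciprocal = sum(1.0 / v for v in variances)
    return 1.0 - k * k / (total * total_reciprocal)


# Standard deviations 10, 30, 10, 40, as in the later worked example.
variances = [sd ** 2 for sd in (10, 30, 10, 40)]
r2 = true_r_squared(variances)
print(round(r2, 3))  # 0.727

# With equal variances the product equals k^2, so R^2 is zero
# (up to floating-point rounding).
r2_equal = true_r_squared([5.0, 5.0, 5.0])
print(r2_equal)
```

Any rescaling of all the variances by a common factor leaves the result unchanged, since only the product of the two sums enters.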


We find easily when k = 2 that in the $V^{1}_{11}$, $V^{2}_{11}$ quarter-plane the equipotentials are pairs
of lines
$V^{1}_{11} = aV^{2}_{11}$ and $aV^{1}_{11} = V^{2}_{11},$ (28)
where $a = (1+R)/(1-R).$ (29)
For k>2 the equipotential surfaces in k dimensions are cones through the origin situated
symmetrically with regard to the co-ordinate primes.
Reverting to k = 2 three methods are available for testing the hypothesis that $V^{1}_{11} = V^{2}_{11}$:
(i) Fisher's z or $F = \exp(2z) = g_{11}/g_{22}.$ (30)


(ii) the $L_1$ criterion introduced by Pearson & Neyman (1930) and later extended to
k>2 (Neyman & Pearson, 1931). In the instance we are considering, i.e. equal sample sizes
from both populations,
$L_1 = 2(g_{11}g_{22})^{1/2}/(g_{11}+g_{22})$ (31)
$\phantom{L_1} = 2F^{1/2}/(1+F).$ (31a)
(iii) $W(2,1) = 4[g_{11}g_{22} - (g_{12})^2]/[(g_{11}+g_{22})^2 - (2g_{12})^2].$* (32)


Thus tests (i) and (ii) are exactly equivalent, as is known, the optimum critical region being 
that corresponding to equal tails of the F-distribution. Criterion (iii) is that obtained by 
Morgan (1939) and Pitman (1939), appearing as equation (12) in Morgan’s paper, to test 
that the variances in a normal bivariate population are equal. Morgan has compared the
powers of tests (i) and (iii) for n = 12, 25 and 100 at a significance level of 0.10 and for these
sample sizes it appears that the tests are effectively of equal power.
When n is large and, consequently, the two populations being independent, $(g_{12})^2/(g_{11}g_{22})$
is converging in probability to zero,
$W(2,1) \sim L_1^2.$ (33)

The cumulative distribution functions of criteria (ii) and (iii) are respectively $I_x\big(\tfrac{n-1}{2}, \tfrac{1}{2}\big)$
(Nayer, 1936) ($x = L_1^2$) and $I_x\big(\tfrac{n-2}{2}, \tfrac{1}{2}\big)$ ($x = W(2,1)$). Generally, W(k,1) for large n
converges in probability to the harmonic mean of the sample variances divided by their
arithmetic mean; $L_1$ (for equal sample sizes) is exactly equal to the geometric mean divided by
the arithmetic mean.
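
The relations among the three criteria for k = 2 can be checked numerically. The following Python sketch evaluates equations (30), (31), (31a) and (32); the function names and the values of the g's are hypothetical, chosen only to illustrate the algebra.

```python
import math


def l1_criterion(g11, g22):
    """L_1 of equation (31): geometric mean over arithmetic mean of g11, g22."""
    return 2.0 * math.sqrt(g11 * g22) / (g11 + g22)


def w_2_1(g11, g22, g12):
    """W(2, 1) of equation (32)."""
    numerator = 4.0 * (g11 * g22 - g12 ** 2)
    denominator = (g11 + g22) ** 2 - (2.0 * g12) ** 2
    return numerator / denominator


# Hypothetical sample values, for illustration only.
g11, g22 = 4.0, 9.0

# Equation (31a): L_1 expressed through F = g11 / g22.
F = g11 / g22
l1 = l1_criterion(g11, g22)
l1_from_f = 2.0 * math.sqrt(F) / (1.0 + F)

# Equation (33): when g12 vanishes, W(2, 1) reduces to L_1 squared.
w_independent = w_2_1(g11, g22, 0.0)

print(l1, l1_from_f, w_independent, l1 ** 2)
```

A non-zero g12 makes W(2,1) depart from the square of L_1, which is the sense in which (33) holds only in the limit.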


EXAMPLE OF THE USE OF THE TEST FOR A CASE WITH k=4, p=1 


It is not easy to calculate W(k, 1) from equation (17) if k > 3. Indeed, the main value of (17)
lies in showing the form of solution, and in establishing that this is independent of the
particular orthogonal transformations used. In the following example, therefore, orthogonal
transformations are made at once and the multiple correlation coefficient is calculated from
the numerical data. This procedure is far quicker than that involved in calculating W(4, 1)
from (17).


* See Appendix. 














Below are given samples of 10 from each of four univariate normal populations:

 x_1   x_2   x_3   x_4        x_1   x_2   x_3   x_4

 -20   +24   + 4   +52        + 7   +15   + 8   - 8
 - 1   +18   + 9   -24        + 5   +24   - 1   +56
 -11   +27   -27     0        +18   -12   + 1   -64
 +10   +21   + 5   +48        +13   -24   - 4   +12
 - 4   -48   - 3   +48        - 6   +12   + 5   -12

which have mean zero and standard deviations respectively 10, 30, 10, 40. Make the following
orthogonal transformation:

$y_1 = x_1 + x_2 + x_3 + x_4$,   $y_2 = x_1 + x_2 - x_3 - x_4$,
$y_3 = x_1 - x_2 + x_3 - x_4$,   $y_4 = x_1 - x_2 - x_3 + x_4$,

and obtain

 y_1   y_2   y_3   y_4        y_1   y_2   y_3   y_4

 +60   -52   -92   + 4        +22   +22   + 8   -24
 + 2   +32   +14   -52        +84   -26   -76   +38
 -11   +43   -65   -11        -57   +69   +95   -57
 +84   -22   -54   +32        - 3   -19   +21   +53
 - 7   -97   - 7   +95        - 1   +13   - 1   -35

Form the matrix of sums of squares and cross-products, i.e. C. This is

 +18636.1    -9646.9   -18232.9    +7325.1
  -9646.9   +21784.1   +12018.1   -19015.9
 -18232.9   +12018.1   +28692.1    -9445.9
  +7325.1   -19015.9    -9445.9   +22008.1

The matrix of sample correlation coefficients is therefore

  1         -0.4788   -0.7885   +0.3617
 -0.4788     1        +0.4807   -0.8685
 -0.7885    +0.4807    1        -0.3759
 +0.3617    -0.8685   -0.3759    1


Hence the multiple correlation coefficient of $y_1$ on $y_2$, $y_3$, $y_4$ is given by $R^2$ = 0.637. Calculated
by the approximation indicated in the last paragraph of the preceding section* $R^2$ = 0.656;
the true value obtained from equation (27) with standard deviations in the ratio 1:3:1:4 is 0.727.
The upper 10 and 5% levels of significance, obtained from Thompson's tables with
$\nu_1 = n-k = 6$, $\nu_2 = k-1 = 3$, are respectively 0.622 and 0.704. We find $L_1$ = 0.565, the
5 and 1% levels obtained from Nayer's (1936) tables being respectively 0.797 and 0.719,
so that this test gives a more significant result than the one based on $R^2$. The relative merits
of $L_1$ and the test we have provided, which cannot be judged on the results of one example,
remain a problem to be investigated.


* I.e. calculated from 1 - (harmonic mean of $g_{ii}$)/(arithmetic mean of $g_{ii}$).
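
The figures quoted in this example can be retraced with a short Python sketch. It computes R² of y_1 on y_2, y_3, y_4 from the printed correlation matrix, and the approximation 1 - (harmonic mean)/(arithmetic mean) and L_1 from the sample values. The helper names are introduced here for illustration, and the x values are transcribed from the table of samples, with entries that are illegible in the scan restored from the transformed values y; they should be read as an assumption, not as a verified transcription.

```python
import math

# Sample correlation matrix of y1..y4 as printed in the example.
R = [
    [1.0, -0.4788, -0.7885, 0.3617],
    [-0.4788, 1.0, 0.4807, -0.8685],
    [-0.7885, 0.4807, 1.0, -0.3759],
    [0.3617, -0.8685, -0.3759, 1.0],
]


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda i: abs(m[i][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for i in range(col + 1, n):
            factor = m[i][col] / m[col][col]
            for j in range(col, n + 1):
                m[i][j] -= factor * m[col][j]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


# R^2 of y1 on (y2, y3, y4) is r' R22^{-1} r, with r the first row of
# correlations and R22 the correlation matrix of the remaining variates.
r = R[0][1:]
R22 = [row[1:] for row in R[1:]]
r_squared = sum(ri * bi for ri, bi in zip(r, solve(R22, r)))

# Samples of 10 from the four populations (assumed transcription, see above).
x = [
    [-20, -1, -11, 10, -4, 7, 5, 18, 13, -6],
    [24, 18, 27, 21, -48, 15, 24, -12, -24, 12],
    [4, 9, -27, 5, -3, 8, -1, 1, -4, 5],
    [52, -24, 0, 48, 48, -8, 56, -64, 12, -12],
]


def sum_sq_dev(values):
    """Sum of squared deviations from the sample mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


# Any common divisor of the g's cancels in the ratios below.
g = [sum_sq_dev(v) for v in x]
arithmetic = sum(g) / len(g)
harmonic = len(g) / sum(1.0 / gi for gi in g)
geometric = math.exp(sum(math.log(gi) for gi in g) / len(g))

approx_r_squared = 1.0 - harmonic / arithmetic  # approximately 0.656
l1 = geometric / arithmetic                      # approximately 0.565

print(round(r_squared, 3), round(approx_r_squared, 3), round(l1, 3))
```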










SUMMARY 


An exact test has been put forward for the equality of variances and covariances in any 
number k of 1- or 2-variate normal populations; the test is also exact for two 3- or 4-variate 
populations; but is restricted in application to equal sample sizes n from the k populations 
where n exceeds pk, p being the number of variates. The moments of the criterion are avail- 
able for k p-variate populations where the statistic used is equivalent to that employed by 
Wilks (1935) to test the independence of two groups of variates (of sizes p and p(k—1)), and 
has the same distribution. In the univariate case the power of the test is known as a function 
of one parameter. Comparison with the $L_1$ criterion has already been made when p = 1
and k = 2, the tests being practically the same, and an example worked out of the use of the
test when p = 1.


Our thanks are due to E. C. Fieller for drawing our attention to the papers by Morgan and
Pitman and suggesting that the test given there for the equality of two variances might be
extended to more than two; also to Prof. E. S. Pearson for pointing out the need of certain
explanatory additions.

The work described above has been carried out as part of the research programme of the
National Physical Laboratory, and this paper is published by permission of the Director
of the Laboratory.


REFERENCES 


Aitken, A. C. (1939). Determinants and Matrices. Oliver and Boyd, Edinburgh.

Bishop, D. J. (1939). On a comprehensive test of the homogeneity of variances and covariances in
multivariate problems. Biometrika, 31, 31.

Fisher, R. A. (1928). The general sampling distribution of the multiple correlation coefficient. Proc.
Roy. Soc. A, 121, 654.

Geary, R. C. (1935). The ratio of the mean deviation to the standard deviation as a test of normality.
Biometrika, 27, 310.

Morgan, W. A. (1939). A test for the significance of the difference between the two variances in a
sample from a normal bivariate population. Biometrika, 31, 13.

Nair, U. S. (1939). The application of the moment function in the study of distribution laws in statistics.
Biometrika, 30, 274.

Nayer, P. P. N. (1936). An investigation into the application of Neyman and Pearson's L_1 test, with
tables of percentage limits. Statist. Res. Mem. 1, 38.

Neyman, J. & Pearson, E. S. (1931). On the problem of k samples. Bull. int. Acad. Cracovie, Série A,
p. 460.

Pearson, E. S. & Neyman, J. (1930). On the problem of two samples. Bull. int. Acad. Cracovie,
Série A, p. 73.

Pitman, E. J. G. (1939). A note on normal correlation. Biometrika, 31, 9.

Thompson, C. M. (1941). Tables of percentage points of the incomplete beta-function. Biometrika,
32, 168.

Wald, A. & Brookner, R. J. (1941). On the distribution of Wilks' statistic for testing the independence
of several groups of variates. Ann. Math. Statist. 12, 137.

Wilks, S. S. (1932). Certain generalizations in the analysis of variance. Biometrika, 24, 471.

Wilks, S. S. (1935). On the independence of k sets of normally distributed statistical variables.
Econometrica, 3, 309.

Wilks, S. S. (1943). Mathematical Statistics. Princeton University Press.










APPENDIX 


As an illustration of the algebraic form of W(k, 1) the Editor has suggested to me that it
might be helpful to show the relation of the general formula (17) to the matrix G in this
simple case when k = 2. Here, using a common notation for a sample mean


ou = = (zj.—74-)*, Joe = 4 (x3,-27.)7, Iie = = (x4. — 24.) (ei. —24.), 


= ~~ at G= few ~ | oud -9t) 
S(2,1) = 9r+9eet+ 212, (2,1) = poet ssf 


Whence, using (17), (32) is at once obtained for W(2,1). For k>2 the full expression for 
S(k, 1) in terms of the g’s is complicated and the matrix notation becomes essential. 










THE ESTIMATION FROM INDIVIDUAL RECORDS OF THE 
RELATIONSHIP BETWEEN DOSE AND QUANTAL RESPONSE 


By D. J. FINNEY 
Lecturer in the Design and Analysis of Scientific Experiment, University of Oxford 


1. INTRODUCTION


A type of biometric problem frequently encountered by the statistician is that which 
requires the estimation and study of a relationship between dose and response. ‘Dose’ is 
here a general term indicating the magnitude of a stimulus applied to certain test subjects, 
and ‘response’ is a measure of the effect which the stimulus produces on the subjects. When 
the test subjects are living matter, whether plants, animals or bacteria, pieces of tissue or 
single cells, the response to a specified dose is unlikely to be constant in repeated trials, and 
regression methods must be used in the estimation of the relationship. 

In some classes of data, the response is ‘all-or-nothing’ or quantal, and cannot be measured 
quantitatively. Ordinary regression methods are then no longer applicable; methods based 
on the transformation of the proportion of subjects showing the response at any dose level 
to the normal equivalent deviate (Gaddum, 1933), or to the probit (Bliss, 1934a, b), however,
have proved very powerful for simplifying the statistical analysis. In recent years, full
accounts of the underlying theory of these transformations, and of their application, have
been published by various authors (see, for example, Bliss, 1935a, b; Finney, 1947, 1948).
An additional difficulty sometimes found is that the intensity of the stimulus cannot be 
selected in advance of a test, but can only be measured after the test has taken place; only 
rarely will two or more subjects happen to receive exactly the same dose, and more usually 
the records consist of a list of doses with, for each, a statement of whether a single subject 
receiving that dose responded or not.* For example, in some methods for the testing of 
insecticidal potency, poison bait is offered to individual insects; the dose received by any 
insect cannot be specified in advance, and must instead be measured as the amount of poison 
ingested. 

Data from experiments of this kind do not give empirical values for the proportion of 
subjects responding at each dose level, except in the trivial sense that every dose shows either 
zero or 100 % responding. Nevertheless, as Bliss (1938) has pointed out, the probit method 
can still be applied to estimation of the dose-response relationship. He has given a numerical 
example, though without showing full details of the working, but has admitted that assess- 
ment of the error of estimation presents some theoretical difficulties (Finney, 1947, § 43). 
An interesting example of experimental results requiring this type of analysis has recently 
been brought to the notice of the writer by Mr R. W. Gilliatt. These introduce an additional 
complication, since the dose is expressed in terms of two measurements, and a probit plane 
(Finney, 1943) or other bivariate regression function must therefore be estimated. An 
account of the analysis, with computational details, may help those who have encountered 
analogous problems in biological or other investigations. 


* When response does not involve death or serious alteration of the test subject, one subject may be
used many times; the example discussed in this paper is an instance. The form of the data will be the same,
though the interpretation may require that tolerance variation between and within subjects be
distinguished.
















2. THE DATA 
Research in human physiology has demonstrated that, under carefully controlled experi- 
mental conditions, a transient reflex vaso-constriction in the skin of the digits may follow 
a single deep breath (Bolton, Carmichael & Stürup, 1936). Gilliatt (1947) has found that the
response depends in part on the volume of air taken in by the subject. Plethysmographic 
measurement of the volume changes in a finger was used to indicate the occurrence of a 
response, but assessment of the degree of vaso-constriction, in order to relate this to the 
inspiratory stimulus, was not practicable. Thus the records obtained for each test show only 


[Fig. 1. Contours of dose-response surface for 0.1, 0.25, 0.5, 0.75 and 0.9 frequency of response,
estimated from three-parameter equation. ○ no vaso-constriction; ● vaso-constriction.
Horizontal axis: volume of inspiration (litres); vertical axis: rate of inspiration (litres per sec.).]


the volume of air inspired, the average rate of inspiration, and whether or not vaso-con- 
striction was produced. The above brief outline is sufficient for appreciation of the statistical 
problem, but a full account of the experimental procedure may be found in Gilliatt’s paper; 
the results discussed here are presented in his Fig. 5. 

The data, which Mr Gilliatt has kindly made available to the writer, were obtained from 
thirty-nine tests, in twenty of which vaso-constriction occurred. Tests were made on three 
different subjects, nine on D.W., eight on V.P.W., and twenty-two on S.J.S.; the results of
the tests, with the subjects in this order, are shown in Table 1. In Fig. 1 are shown the 
thirty-nine combinations of volume in litres (V) and rate of inspiration in litres per second 





























[Table 1. Experimental data and details of calculations. The body of the table is illegible in this
copy. Columns: volume in litres (V); rate in litres per sec. (R); x_1; x_2; response; Y; w; y;
wx_1; wx_2; wy; P. Weighted sums entered at the foot of the table: Swx_1² = 16.235606,
Swx_1x_2 = 18.338532, Swx_2² = 22.782383, Swx_1y = 79.738989, Swx_2y = 94.125032,
Swy² = 433.586326.]








(R), together with indications of whether or not the subject responded under these conditions. 
Inspection of Fig. 1 shows that, in general, when both V and R were small no response 
occurred, when either was large (unless the other was very small) the response occurred, and 
in an intermediate region the proportion of responses increased as either V or R increased. 
There was no sharply defined threshold separating combinations of V and R giving the 
response from those giving no response; instead, there appeared to be a probability of response 
ranging from practical certainty under some conditions to zero under others. 

As an aid to fuller understanding of the influence of breathing on vaso-constriction,
examination of the relationship between V, R, and the probability of response seemed desirable.
Since so few observations were available for each subject, the data were unlikely to be
sufficient to show differences between subjects; this point is discussed later, but in the main
analysis the distinction between subjects is ignored. For any form of response assessment,
the testing of one subject many times must introduce a danger that the result of one test
will be affected not only by its own stimulus but by preceding stimuli and by the effects they
produced. In this investigation, each subject was given a number of preliminary tests until
he appeared to have settled into the routine. The observations recorded in Table 1 were
obtained after these preliminary trials; they are tabulated in the order of testing, and show
no indication of effects of previous history, but clearly such effects would have to be very
pronounced if they were to be detectable on this amount of data.


3. METHOD OF ANALYSIS 


Preliminary examination of the data suggested that the occurrence of a response was largely 
determined by the magnitude of VR, the product of volume and rate, curves on which the 
probability of response has a constant value being approximately hyperbolae of the form 


VR = constant. (1) 


A little consideration shows that an equation of this type is more reasonable than an equation 
linear in V and R, though the data are almost certainly inadequate for discriminating between 
many alternative types of relationship that might be postulated. A system of curves similar 
to, but rather more general than, equation (1), namely, 

$V^{\beta_1}R^{\beta_2} = \text{constant},$ (2)
was selected for trial; this equation may alternatively be regarded as representing a series
of parallel linear relationships

$\beta_1 \log V + \beta_2 \log R = \text{constant}$ (2a)
between the logarithms of volume and rate for a fixed probability of response.

A specified combination of V and R will not necessarily always give the same result 
(response or no response) with a subject, for, even though the subject is unaltered, minor 
uncontrolled variations in his environment may affect his susceptibility to the applied 
stimulus. For a particular value of V, the threshold value of R (the value which under the 
conditions prevailing at any instant would be just sufficient to produce a response) will 
have a frequency distribution; similarly, for a particular R, there will be a frequency dis- 
tribution of threshold values of V. If these distributions may be taken as normal in log V 
and log R, and, for simplicity, they are supposed to be such that the mean of either logarithm 
is linearly related to the selected value of the other, then the probability of response will be 
determined by an expression of the form 


$\beta_1 \log V + \beta_2 \log R,$













and the threshold values of this quantity will be normally distributed. If $x_1$ and $x_2$ are
written for log (10V) and log (10R) respectively (the factor of 10 is introduced in order to
make $x_1$ and $x_2$ always positive), this statement enables the probability of response, P, to
be expressed as

$P = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\alpha + \beta_1 x_1 + \beta_2 x_2} e^{-\frac{1}{2}u^2}\,du,$ (3)

where $\alpha$, $\beta_1$ and $\beta_2$ are parameters to be estimated from the data. The estimation may be
regarded as the fitting of a probit regression plane, Y, the probit of P, being given by

$Y = 5 + \alpha + \beta_1 x_1 + \beta_2 x_2.$ (4)


Substitution of the value of Y corresponding to a specified probability gives the required
linear relationship, equation (2a), between $x_1$ and $x_2$ for that probability, from which the
estimated curves of constant probability, equation (2), may easily be derived.
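
Equations (3) and (4) translate directly into a few lines of Python. The sketch below is illustrative only: the function names are introduced here, and the parameter values used in the loop are hypothetical, not estimates from the data.

```python
import math
from statistics import NormalDist


def probit_plane(v, r, alpha, beta1, beta2):
    """Equation (4): Y = 5 + alpha + beta1*x1 + beta2*x2,
    with x1 = log10(10 V) and x2 = log10(10 R)."""
    x1 = math.log10(10.0 * v)
    x2 = math.log10(10.0 * r)
    return 5.0 + alpha + beta1 * x1 + beta2 * x2


def response_probability(v, r, alpha, beta1, beta2):
    """Equation (3): the standard normal integral up to Y - 5."""
    y = probit_plane(v, r, alpha, beta1, beta2)
    return NormalDist().cdf(y - 5.0)


# Hypothetical parameter values, chosen only to show the shape of the surface.
alpha, beta1, beta2 = -14.0, 6.5, 6.0
for v, r in [(0.5, 0.5), (1.0, 1.5), (2.0, 2.5)]:
    print(v, r, round(response_probability(v, r, alpha, beta1, beta2), 3))
```

With positive coefficients the probability rises with either V or R, and it equals one-half exactly where the probit Y is 5.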

The procedure for fitting a probit plane has been described elsewhere (Finney, 1943, 1947, 
§31), and its chief features need no alteration for application to individual records. Pro- 
viding that a first approximation to the equation can be guessed, repeated cycles of com- 
putation will give values for the parameters which approach more and more closely to the 
maximum likelihood estimates. Care in the choice of the first approximation will reduce 
the number of cycles needed; a poor choice will delay the convergence, though it will not 
affect the ultimate result. Since only a single observation is available for each combination 
of $x_1$ and $x_2$, every working probit is either a maximum or minimum value, according to
whether or not the response occurs. When there is only one dose factor, in the fitting of a 
probit regression line to records of individuals, grouping of doses and treatment of the 
observations in a group as if they related to an average dose may reduce the labour of the 
early computing cycles, but, since it will tend to give an underestimate of the regression 
coefficient, the final cycle may need to use the detailed records. Bliss (1938) has given an 
example illustrating grouping of this kind. Grouping is less easily applied, however, when two 
or more dose factors have to be used, and, for the data under discussion, the individual records 
were used throughout except in the formation of the first approximation.

In the standard form of probit analysis, with moderately large numbers of observations
at each level of dose, a $\chi^2$ is usually computed for testing the significance of discrepancies
between the data and the fitted equation; this $\chi^2$ is numerically the same as would be obtained
by calculation from expected and observed numbers of responses and non-responses for
each dose. If there are few observations in any dose group, the expected number of responses
or of non-responses (or of both) is likely to be small, and, as is well known, $\chi^2$ may then fail
to follow the sampling distribution tabulated for that statistic. Data of the type under
discussion here are extreme examples of this situation, the number of observations for each
dose being reduced to unity, so that any disturbance of the $\chi^2$ distribution is likely to be
encountered in its most acute form. No complete theoretical investigation of this matter
has yet been made, but the practical implications are discussed more fully in § 5.

On the assumption that the estimate of equation (4) is an adequate representation of the 
data, lines of constant response probability may be obtained for any specified probability; 
these may be plotted according to equation (2) on a V, R scale. Standard statistical processes
also enable fiducial limits to be assigned to the position of any of these curves. The difficulty 
of dealing with the estimation of error for individual records, and the inadequacy of the 
data for any sensitive test of whether equation (2) is a satisfactory representation of the 










system of curves, throw doubts on the exact interpretation of these fiducial limits. Never-
theless, they give some idea of the confidence that can be attached to the estimated curves, 
at least for moderate values of V and R: for extremes of either measurement, far more 
extensive data would be needed before much faith could be placed in the fitted equation. 


4. COMPUTATIONS FOR ESTIMATING THE THREE-PARAMETER EQUATION 

In this and the two succeeding sections, the computations for Gilliatt’s data will be described 
in detail. The first five columns of Table 1 show the thirty-nine pairs of values of V and R 
which occurred in the experiments, followed by the corresponding values of $x_1$ and $x_2$,
together with a statement of whether or not the subject responded. Before the probit
computations could be initiated, a first approximation to equation (4) was needed; this was
obtained with the aid of the suggestion, from the plotting of the data shown in Fig. 1, that
the constant probability curves were approximately the hyperbolae of equation (1), or
alternatively $x_1 + x_2 = \text{constant}$.

As Bliss (1938) has pointed out, there is no objection to the use of overlapping groups in the
formation of the first approximation. The data were therefore grouped according to the
value of $(x_1 + x_2)$, as shown below, and the proportion of responses in each group was obtained
from Table 1:








 x_1 + x_2    Responses    Proportion (p)    Probit of p    First approximation

 1.5-1.9        0/7            0.00              -               3.3
 1.6-2.0        0/7            0.00              -               3.6
 1.7-2.1        2/7            0.29             4.4              3.9
 1.8-2.2        2/9            0.22             4.2              4.2
 1.9-2.3        3/14           0.21             4.2              4.5
 2.0-2.4        8/19           0.42             4.8              4.8
 2.1-2.5       13/24           0.54             5.1              5.1
 2.2-2.6       17/25           0.68             5.5              5.4
 2.3-2.7       16/17           0.94             6.6              5.7
 2.4-2.8       12/12           1.00              -               6.0

Each proportion was regarded as an estimate for the median value of $(x_1 + x_2)$ in the group,
i.e. 1.7, 1.8, 1.9, ..., and its probit was read from one of the standard tables (Finney, 1947,
Table I; Fisher & Yates, 1947, Table IX). As may be seen above, these probits were fairly
well fitted by the guessed equation

$Y = -1.8 + 3(x_1 + x_2),$ (5)
which was therefore used as a first approximation to equation (4).
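
The grouped proportions, their probits and the values given by equation (5) can be regenerated as follows. This Python sketch takes the probit as the normal equivalent deviate increased by 5, and uses the group midpoints and response counts of the grouping described above; the function names are introduced here for illustration.

```python
from statistics import NormalDist


def probit(p):
    """Probit of a proportion p (0 < p < 1): normal equivalent deviate plus 5."""
    return 5.0 + NormalDist().inv_cdf(p)


def first_approximation(s):
    """Equation (5), with s = x1 + x2."""
    return -1.8 + 3.0 * s


# (median of x1 + x2 in the group, responses, tests)
groups = [
    (1.7, 0, 7), (1.8, 0, 7), (1.9, 2, 7), (2.0, 2, 9), (2.1, 3, 14),
    (2.2, 8, 19), (2.3, 13, 24), (2.4, 17, 25), (2.5, 16, 17), (2.6, 12, 12),
]

rows = []
for mid, responses, tests in groups:
    p = responses / tests
    # The probit is undefined (infinite) for proportions of 0 and 1.
    empirical = round(probit(p), 1) if 0.0 < p < 1.0 else None
    rows.append((mid, round(p, 2), empirical, round(first_approximation(mid), 1)))

for row in rows:
    print(row)
```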

A first set of expected probits was calculated from equation (5), and inserted as Y in an 
earlier version of Table 1. A cycle of routine probit calculations, just as described in the next 
two paragraphs, then led to an improved approximation to the required estimate, on which 
a second cycle of improvement was based. The figures shown in Table 1 relate to the fourth 
of these cycles, based upon the approximation 

$Y = -9.127 + 6.666x_1 + 5.906x_2$ (6)


from the third cycle. Equation (6) is very different from equation (5), suggesting that more 
care might have been given to the selection of a first approximation; that the grouping 







adopted would lead to underestimation of the regression coefficients was expected, but 
insufficient allowance for this was made. Of course the ‘improvement’ in the approximations 
refers to their approach to the solution of the maximum likelihood equations, and is not
necessarily always an approach to the true relationship.

The column of expected probits, Y, in Table 1 was calculated by substitution of pairs of
values $x_1$, $x_2$ in equation (6); one decimal place here is quite sufficient. The weighting
coefficient, w, for each observation was then read from tables (Finney, 1947, Table II; Fisher &
Yates, 1947, Table XI) and entered in its column. The working probit, y, takes a maximum
value for every observation giving a response and a minimum value for every observation
giving no response, since these give empirical rates of 100% and zero respectively; values
of y were read directly from Finney's table (1947, Table III; or, less simply for the minimum
values, from Fisher & Yates, 1947, Table XI). The numbers of decimal places shown for the
entries in Table 1 are sufficient for data of this type; indeed possibly one decimal for w and
for y would be enough. Columns $wx_1$, $wx_2$, and wy were then filled, and the weighted sums of
squares and products of deviations, required for the calculation of the regression of y on
$x_1$ and $x_2$, were completed at the bottom of the table.
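
The quantities read from tables in this step can also be computed directly. The Python sketch below assumes the usual probit-analysis definitions, which are supplied here as assumptions rather than quoted from the text: weighting coefficient w = Z²/PQ, maximum working probit Y + Q/Z and minimum working probit Y - P/Z, where P is the probability corresponding to the expected probit Y, Q = 1 - P, and Z is the normal ordinate at Y - 5.

```python
from statistics import NormalDist

_STANDARD_NORMAL = NormalDist()


def weight_and_working_probits(expected_probit):
    """Return (w, y_max, y_min) for an expected probit Y.

    Assumed definitions: w = Z^2 / (P Q), y_max = Y + Q / Z, y_min = Y - P / Z.
    """
    deviate = expected_probit - 5.0
    p = _STANDARD_NORMAL.cdf(deviate)
    q = 1.0 - p
    z = _STANDARD_NORMAL.pdf(deviate)
    w = z * z / (p * q)
    y_max = expected_probit + q / z   # used when the subject responded
    y_min = expected_probit - p / z   # used when the subject did not respond
    return w, y_max, y_min


w, y_max, y_min = weight_and_working_probits(5.0)
print(round(w, 4), round(y_max, 4), round(y_min, 4))

# For a single responding subject the weighted squared deviation of the
# working probit from Y reduces to Q / P.
Y = 4.2
w42, ymax42, _ = weight_and_working_probits(Y)
p42 = _STANDARD_NORMAL.cdf(Y - 5.0)
print(w42 * (ymax42 - Y) ** 2, (1.0 - p42) / p42)
```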

The equations giving the estimates of the regression coefficients, $b_1$ and $b_2$, are

$0.494528\,b_1 - 0.382729\,b_2 = 1.032130,$
$-0.382729\,b_1 + 0.517714\,b_2 = 0.516978.$

Later calculations use the variances and covariance of $b_1$ and $b_2$; the equations were therefore
solved by first obtaining the matrix inverse to that formed by the coefficients of $b_1$ and $b_2$
(Finney, 1943, 1947, § 31; Fisher, 1946, § 29). This matrix is

$V = \begin{pmatrix} v_{11} & v_{12} \\ v_{12} & v_{22} \end{pmatrix} = \begin{pmatrix} 4.726144 & 3.493883 \\ 3.493883 & 4.514482 \end{pmatrix};$ (7)

the accuracy of the data is insufficient to need the number of decimal figures shown here,
but their retention assists the checking and maintains the internal consistency of the
analysis. Now
$b_1 = 1.032130\,v_{11} + 0.516978\,v_{12} = 6.68426,$
and similarly $b_2 = 5.94003$.
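
The solution by way of the inverse matrix can be retraced in a few lines of Python. The coefficients and right-hand sides are those of the two equations above; the variable names are introduced here for illustration.

```python
# Coefficients of the normal equations (weighted sums of squares and
# products of deviations of x1 and x2) and their right-hand sides.
a11, a12, a22 = 0.494528, -0.382729, 0.517714
rhs1, rhs2 = 1.032130, 0.516978

# Inverse of the symmetric 2 x 2 coefficient matrix.
det = a11 * a22 - a12 * a12
v11 = a22 / det
v12 = -a12 / det
v22 = a11 / det

# Regression coefficients obtained from the inverse matrix.
b1 = rhs1 * v11 + rhs2 * v12
b2 = rhs1 * v12 + rhs2 * v22

print(round(v11, 6), round(v12, 6), round(v22, 6))
print(round(b1, 5), round(b2, 5))
```

Keeping the inverse matrix, rather than solving the equations by elimination, makes the variances and covariance of the coefficients available for the later calculations.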
The estimate of equation (4) is then 
$Y = \bar{y} + b_1(x_1 - \bar{x}_1) + b_2(x_2 - \bar{x}_2)$
or $Y = -9.182 + 6.6843x_1 + 5.9400x_2,$ (8)

a result which differs little from equation (6) and may be regarded as a sufficiently close
approximation to the maximum likelihood estimate. Since

$b_2/b_1 = 0.889,$ (9)
equation (8) may be transformed to give
$VR^{0.889} = \text{constant}$ (10)


as the relationship estimated to exist between V and R for a specified probability; the value
of the constant can be obtained by substitution of the probit of the probability in equation
(8), a process which gives 1.10, 1.36, 1.71, 2.16 and 2.66 for probabilities of 10, 25, 50, 75
and 90% respectively. Typical contours have been drawn in Fig. 1 so as to indicate the form
of the relationship.
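
The constants of the contours follow from equation (8) once the probit of each probability is substituted. A minimal Python sketch, in which the function name is introduced here and the probit is taken as the normal equivalent deviate plus 5:

```python
from statistics import NormalDist

# Fitted probit plane, equation (8): Y = A + B1*x1 + B2*x2,
# with x1 = log10(10 V) = 1 + log10 V and x2 = log10(10 R) = 1 + log10 R.
A, B1, B2 = -9.182, 6.6843, 5.9400


def contour_constant(probability):
    """Constant c in V * R**(B2/B1) = c for a given probability of response."""
    y = 5.0 + NormalDist().inv_cdf(probability)
    ratio = B2 / B1
    # Dividing (8) by B1 gives x1 + ratio*x2 = (Y - A)/B1, so that
    # log10 V + ratio*log10 R = (Y - A)/B1 - 1 - ratio.
    log_constant = (y - A) / B1 - 1.0 - ratio
    return 10.0 ** log_constant


constants = [contour_constant(p) for p in (0.10, 0.25, 0.50, 0.75, 0.90)]
print([round(c, 2) for c in constants])
```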










5. GOODNESS OF FIT 


When probit analysis is applied to data containing many observations in each dose group,
the weighted sum of squares of deviations between the empirical probits and the predictions
from the fitted equations is a $\chi^2$, with degrees of freedom equal to the number of dose groups
reduced by the number of fitted parameters. If $S_{uv}$ is written for the weighted sum of products
of deviations of variates u and v, application of this method here would give

$\chi^2_{[36]} = S_{yy} - b_1 S_{x_1 y} - b_2 S_{x_2 y}$
$\phantom{\chi^2_{[36]}} = 40.045 - 6.6843 \times 1.0321 - 5.9400 \times 0.5170$
$\phantom{\chi^2_{[36]}} = 30.08.$ (11)
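
The arithmetic of equation (11) is easily checked. In the Python sketch below the variable names are introduced for illustration; the numerical values are those quoted in the equation, and the degrees of freedom are the thirty-nine observations less the three fitted parameters.

```python
# Weighted sums of squares and products of deviations, as quoted in (11).
s_yy = 40.045
s_x1y = 1.0321
s_x2y = 0.5170

# Regression coefficients of the fitted plane, equation (8).
b1, b2 = 6.6843, 5.9400

# Residual weighted sum of squares about the fitted plane.
chi_squared = s_yy - b1 * s_x1y - b2 * s_x2y

# One "dose group" per observation, three fitted parameters.
degrees_of_freedom = 39 - 3

print(round(chi_squared, 2), degrees_of_freedom)  # 30.08 36
```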


When the dose groups are small, however, the $\chi^2$ so calculated cannot be trusted as an
indicator of the significance of deviations from the fitted equation, and it is presumably most
unreliable when each group is reduced to a single observation. Apart from slight discrepancies
caused by imperfect approximation to the maximum likelihood solution, the $\chi^2$ in equation
(11) is algebraically identical with that which would be derived, by the usual form of
calculations, from comparison of observed numbers responding and not responding in each
group with expectations computed from the fitted equation. As is well known from the
study of contingency tables, when the expectations in some classes are small the sampling
distribution of such a $\chi^2$ may be very different from that shown in the standard tables
(Finney, 1947, Table VI; Fisher & Yates, 1947, Table IV); with data from individual records,
no class can have an expectation greater than unity, and for many the expectation will be
very much less, so that the discrepancy from the tabulated $\chi^2$ distribution is likely to be
serious.

The general effect of small expectations on the random sampling distribution of χ² appears
to be that the mean value remains about equal to the number of degrees of freedom, but that
the variance in repeated sampling is increased. Consequently, samples from a population
according with the null hypothesis are likely to show an excess of very high and very low
values, as judged by the tables of χ². Thus there is little danger that significant evidence of
deviations from expectation will be overlooked in an uncritical application of the test, though
apparently significant values of χ² need to be examined with care before they are regarded
as evidence sufficient to justify rejection of the null hypothesis. Low values, as in Gilliatt’s
data (30 with 36 degrees of freedom), need cause little alarm, for they clearly indicate no
serious deviation from expectation. High values may in the first instance be compared
with the standard tables of the χ² distribution: if they fall beyond the significance level, a
closer examination should be made before judging the null hypothesis to be untenable, for
the apparent significance may be due to large contributions from one or two aberrant points.
Gilliatt’s data provide an illustration of this. The expected probits for each pair of values
of x₁ and x₂ in Table 1 have been calculated from equation (8), and the probabilities,
P (= 1 − Q), corresponding to these have been entered in the last column of the table;
P is then the expectation of the number of responses for each dose. The χ² obtained from the
observed and expected numbers in seventy-eight classes is easily seen to be the sum of Q/P
for all doses giving a response, plus P/Q for all giving no response. Inspection of the column
for P shows small contributions to χ² everywhere, except for two instances of responses with
probabilities of only 0.098 and 0.128, contributing 9 and 7 respectively; clearly the occurrence


of these two responses as the most extreme events in thirty-nine trials need not be regarded 


Biometrika 34 22 








328 Individual records of dose and quantal response 


as serious evidence against the null hypothesis. The result of calculating χ² by this more
laborious process is a total of 30.3, which agrees closely with that already given in
equation (11).
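The rule just stated (Q/P for each dose giving a response, P/Q for each giving none) can be sketched in a few lines of Python. This is only an illustration: the function name and the record layout are assumptions made here, and the probabilities in the usage note are the two aberrant values quoted in the text, not the full Table 1.

```python
def chi_square_individual(records):
    """Chi-square for individual records.

    `records` is an iterable of (responded, p) pairs, where `responded`
    is True or False and `p` is the expected response probability P.
    Each record forms two classes (response / no response); the usual
    sum of (observed - expected)^2 / expected over both classes reduces
    to Q/P for a response and P/Q for no response, with Q = 1 - P.
    """
    total = 0.0
    for responded, p in records:
        q = 1.0 - p
        total += q / p if responded else p / q
    return total


def two_class_contribution(responded, p):
    """The same contribution written out as (O - E)^2 / E over both classes."""
    q = 1.0 - p
    observed_yes, observed_no = (1.0, 0.0) if responded else (0.0, 1.0)
    return (observed_yes - p) ** 2 / p + (observed_no - q) ** 2 / q
```

Applied to the two responses with P = 0.098 and P = 0.128, the function gives contributions of about 9.2 and 6.8, in line with the rounded figures 9 and 7 quoted above.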

One method of modifying a χ² test so as to remove its extreme sensitivity to deviations
from small expectations is to combine expected and observed frequencies over several
adjacent groups, so as to obtain groups with larger expectations; the number of degrees of
freedom is then taken as the number of remaining groups less the number of fitted
parameters. Of course the groups must be chosen objectively, and without regard to the
agreement between the frequencies. The statistic still will not follow the χ² distribution exactly,
but the approximation should be fairly satisfactory under the usual restriction that the
groups be so chosen that none of the expected frequencies is small. This procedure often has
to be adopted in probit analysis because of small expectations at very low or very high doses
(Finney, 1947, §18). With individual records, however, only very extensive grouping will
give expectations sufficiently large for the χ² test to be trusted; the reduction of a large χ²
to a value below the significance level might then appear indicative of an insensitive test
rather than of absence of serious discrepancies.*

Probably no completely satisfactory solution of the difficulty is to be expected. Individual
records usually arise from experimental work in which the obtaining of large numbers of
observations presents considerable difficulty. Often the whole series will consist of less than
fifty observations, and, unless previous information enables the range of doses to be chosen
satisfactorily, many of the observations will be made at doses for which response is either
almost certain or almost impossible. Even if the individual dose-tolerances could be measured
directly, a test of normality of their distribution (which is what the χ² test attempts to
provide) could not be very sensitive when based on only fifty measurements; if, instead,
only quantal data are available, indicating merely whether a dose is below or above the
tolerance value, a sensitive normality test is still less likely to exist (Finney, 1947, § 43).

Gilliatt’s data, a series of only thirty-nine observations, provide an extreme instance of
the difficulty of formulating a sensitive test of goodness of fit. Nevertheless, an attempt has
been made to examine the discrepancies between the observations and the null hypothesis
expressed by equation (4). In Table 2 are compared the observed and expected frequencies
when the data are grouped according to the value of VR^0.89. This is equivalent to a grouping
based on the value of Y, the expected probit in equation (8), and, as this quantity had been
evaluated for each observation in order to give P, it was used in the construction of Table 2.
Since three parameters have been estimated from the data, four groups is the least number
for giving a χ² test. The limits of the groups were chosen so as to give similar numbers of
observations in each. Inspection of Table 2 shows that the groups are still too small for a χ²
test to be trusted, thus suggesting that the data are inadequate for any useful test of goodness
of fit to be made. The only anomaly in Table 2 is the occurrence of two responses where the
expectation is 0.3, and this is clearly insufficient to cause much worry.
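To see why the grouped test cannot be trusted here, one may compute the pooled statistic mechanically from the frequencies of Table 2. The sketch below is an illustration added for the reader and is not a calculation reported in the paper; the tuple layout and function name are assumptions.

```python
# Rows of Table 2: (observed -, observed +, expected -, expected +).
TABLE_2 = [
    (8, 2, 9.72, 0.28),
    (6, 0, 3.92, 2.08),
    (5, 8, 4.26, 8.74),
    (0, 10, 0.59, 9.41),
]


def cell_contributions(rows):
    """Return every (observed - expected)^2 / expected term, cell by cell."""
    terms = []
    for obs_neg, obs_pos, exp_neg, exp_pos in rows:
        terms.append((obs_neg - exp_neg) ** 2 / exp_neg)
        terms.append((obs_pos - exp_pos) ** 2 / exp_pos)
    return terms


terms = cell_contributions(TABLE_2)
pooled_chi_square = sum(terms)
largest_share = max(terms) / pooled_chi_square
```

Most of the pooled total comes from the single cell whose expectation is only 0.28 (the two responses noted in the text), which is the kind of sensitivity to small expectations that makes the tabulated χ² distribution a poor guide for these data.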

The inadequacy of the data for detecting any differences in sensitivity between the three
subjects may be seen from Table 3. The first nine entries in Table 1 relate to D.W., and
summation of the values of P gives the expected number of responses for this subject; similarly
the next eight and the last twenty-two entries give the numbers for V.P.W. and S.J.S.
respectively. Inspection of Table 1 shows that the tests on each subject were fairly widely
distributed over the range of values of x₁ and x₂. Table 3 shows excellent agreement between
totals of observed and expected responses for each subject, thus suggesting that any
individual differences that exist are small by comparison with the variation in sensitivity of
the same subject in different tests.

* In his discussion of the analysis of individual records, Bliss (1938) suggests adjustment of the χ²
test, not by altering the calculation of the statistic but by reducing the number of degrees of freedom
allotted to it; he gives an empirical rule for the reduction, based upon the expectations in terminal dose
groups. This method, however, not only lacks any theoretical basis, but seems liable to have an effect
opposite to that which is needed; it will attribute significance to high values of χ² even more readily
than will the unadjusted test.


Table 2. Comparison of observed and expected frequencies of response

                        Frequencies of results
Range of Y        Observed                  Expected
                −      +     Total        −        +
Below 4         8      2      10         9.72     0.28
4-5             6      0       6         3.92     2.08
5-6             5      8      13         4.26     8.74
6 and over      0     10      10         0.59     9.41
Total          19     20      39        18.49    20.51


Table 3. Comparison of subjects

                        Frequencies of results
Subject           Observed                  Expected
                −      +     Total        −        +
D.W.            3      6       9         4.0      5.0
V.P.W.          4      4       8         3.5      4.5
S.J.S.         12     10      22        11.0     11.0
Total          19     20      39        18.5     20.5




















6. LIMITS OF ERROR 


The variances of b₁ and b₂ and the covariance between them are respectively v₁₁, v₂₂ and v₁₂
as defined in equation (7). Hence the variance of Y, the expected probit corresponding to
any pair of values x₁, x₂, is

\[
V(Y) = \frac{1}{Sw} + v_{11}(x_1 - \bar{x}_1)^2 + 2v_{12}(x_1 - \bar{x}_1)(x_2 - \bar{x}_2) + v_{22}(x_2 - \bar{x}_2)^2, \tag{12}
\]

where Sw is the sum of the w column in Table 1. All these variances are derived from binomial
probability distributions. In the usual form of probit analysis, with a batch of subjects at
each dose, the precision of the estimated relationship between dose and response is discussed
as though the variation were normal, an assumption which is justifiable on account of the
large numbers of individuals involved. Here, with only thirty-nine observations in all, the
assumption is less safe, but may be adopted for lack of any more trustworthy method of
dealing with the data. It is unlikely to be seriously misleading, except possibly for extreme
levels of the response probability, P.

Equation (12) may now be used in the assignment of fiducial limits to any one of the curves
of equal probability given by equation (10). For suppose that t is the normal deviate
corresponding to the significance level to be used in defining the fiducial limits, and that Y₀ is the
probit of a probability P₀. Then for any values of x₁, x₂ for which

\[
(Y - Y_0)^2 > t^2 V(Y),
\]


[Fig. 2. Fiducial limits (5 % probability) to 0.5 frequency contour of Fig. 1.
Axes: volume of inspiration (litres) and rate of inspiration (litres per sec.).
○ no vaso-constriction; ● vaso-constriction.]


where Y is determined from equation (8), the expected probit differs significantly from Y₀,
and for values of x₁, x₂ which reverse the inequality the difference is not significant. Therefore
the equation

\[
(Y - Y_0)^2 = t^2 V(Y) \tag{13}
\]

gives the limiting values of (x₁, x₂) for which the null hypothesis that the true expected probit
is Y₀ is not untenable in the light of the data; in other words, equation (13) defines curves in
the (x₁, x₂) plane which are fiducial limits to the estimated locus of points having a constant
response probability P₀. These curves are clearly hyperbolae. In Figs. 2 and 3, the 5 %
fiducial limit curves (t = 1.960) for P₀ = 0.5 and P₀ = 0.9 respectively have been plotted in







the (V, R) plane; details of the calculation need not be given here, but Fig. 2, for example, 
is derived from the equation 


\[
(14.182 - 6.6843x_1 - 5.9400x_2)^2 = 3.841\left[\frac{1}{Sw} + 4.7261(x_1 - 1.0495)^2
+ 6.9878(x_1 - 1.0495)(x_2 - 1.2483) + 4.5145(x_2 - 1.2483)^2\right].
\]

The pairs of curves are like hyperbolae in form. That for P₀ = 0.5 defines a band on either
side of the estimated relationship which is quite narrow for moderate values of V and R


[Fig. 3. Fiducial limits (5 % probability) to 0.9 frequency contour of Fig. 1.
Axes: volume of inspiration (litres) and rate of inspiration (litres per sec.).
○ no vaso-constriction; ● vaso-constriction.]


though naturally it widens considerably at the extremes. That for P₀ = 0.9, as might be
expected from general consideration of the problem, allows much greater uncertainty on the
side of high values of V and R; similarly, for P₀ = 0.1, that band would be relatively wider
on the side of low values of V or R.
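The test behind equation (13) can be sketched as a small Python function that reports whether a point (x₁, x₂) lies inside the fiducial band for a chosen Y₀. The regression coefficients, means and variance terms are read from the numerical equation given for Fig. 2; the intercept is inferred from that equation on the assumption that it was written for Y₀ = 5, and Sw must be supplied by the user because its value is not legible in this section. All function and parameter names are illustrative.

```python
B1, B2 = 6.6843, 5.9400            # regression coefficients b1, b2
INTERCEPT = 5.0 - 14.182           # assumption: inferred with Y0 = 5
X1_BAR, X2_BAR = 1.0495, 1.2483    # means appearing in the Fig. 2 equation
V11, V12, V22 = 4.7261, 6.9878 / 2.0, 4.5145   # 6.9878 is 2 * v12


def expected_probit(x1, x2):
    """Expected probit Y at the point (x1, x2)."""
    return INTERCEPT + B1 * x1 + B2 * x2


def variance_of_probit(x1, x2, sum_w):
    """V(Y) as in equation (12); sum_w stands for Sw."""
    d1, d2 = x1 - X1_BAR, x2 - X2_BAR
    return 1.0 / sum_w + V11 * d1 * d1 + 2.0 * V12 * d1 * d2 + V22 * d2 * d2


def within_fiducial_band(x1, x2, y0, sum_w, t=1.960):
    """True when Y at (x1, x2) does not differ significantly from y0."""
    deviation = expected_probit(x1, x2) - y0
    return deviation ** 2 <= t * t * variance_of_probit(x1, x2, sum_w)
```

For example, `within_fiducial_band(1.0, (14.182 - B1) / B2, 5.0, sum_w=20.0)` tests a point lying on the estimated 0.5 contour itself, which is inside the band for any positive Sw; the value 20.0 is a placeholder and not a figure from the paper.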

The curves shown in Figs. 1, 2 and 3 may be regarded as plane sections, for selected values
of Y, of a three-dimensional diagram relating Y to V and R. In terms of x₁, x₂ instead of
V, R, this diagram is the three-dimensional analogue of the familiar diagram showing a
regression line with hyperbolic curves indicating limits of error on either side; the line










generalizes to a plane, and the limits are now defined by two sheets of a hyperboloid, one 
above and one below the plane. 

The theoretical basis of the curves illustrated in Figs. 2 and 3 is perhaps insecure, but
undoubtedly they give a useful indication of the dependence of the probability of response
on V and R and of the reliability of the estimation of this relationship. Much as an
experimenter might wish for a more precise assessment of the effects of V and R, experience
suggests that results such as those obtained here are as good as can be expected from a total of
thirty-nine quantal observations.


7. THE TWO-PARAMETER EQUATION 
In §3, the equation VR = constant (1)
was suggested as an expression of the curves of constant response probability, but the more
complex equation (2) was adopted for use in §§4-6. There are no theoretical reasons for
believing that equation (1) represents the true form of the relationship, and the more general
form was chosen in order that the complete calculations might be illustrated. The values of
b₁ and b₂ obtained, however, do not differ very greatly by comparison with the standard
error of their difference; in fact

\[
V(b_1 - b_2) = v_{11} - 2v_{12} + v_{22} = 2.253,
\]

and therefore b₁ − b₂ = 0.744 ± 1.501.
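The arithmetic behind this comparison is short enough to check directly. The sketch below takes v₁₁, 2v₁₂ and v₂₂ from the numerical equation given for Fig. 2 and b₁, b₂ from equation (11); the variable names are illustrative only.

```python
from math import sqrt

b1, b2 = 6.6843, 5.9400
v11, two_v12, v22 = 4.7261, 6.9878, 4.5145   # two_v12 is 2 * v12

difference = b1 - b2                          # b1 - b2
variance_of_difference = v11 - two_v12 + v22  # V(b1 - b2)
standard_error = sqrt(variance_of_difference)

# Ratio of the difference to its standard error.
ratio = difference / standard_error
```

The ratio is about one half, far short of the normal deviate 1.960 used elsewhere in the paper for 5 % limits, which is the sense in which the two coefficients "do not differ very greatly".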
In the absence of any significant difference between the regression coefficients, the common
scientific procedure of preferring the simpler hypothesis (Occam’s Razor) suggests that
equation (4) might be replaced by

\[
Y = \alpha + \beta(x_1 + x_2). \tag{14}
\]

For the estimation of equation (14), the computations are similar to, but shorter than, those
of §4, since (x₁ + x₂) may be replaced by a single variate, x, and a simple regression calculated;
the calculations in §4 were used to give a first set of expected probits, from which was
derived the estimate

\[
Y = -9.475 + 6.4067(x_1 + x_2). \tag{15}
\]

Only two parameters have been estimated from the data, and calculation as for equation
(11) gives

\[
\chi^2_{[37]} = 28.76.
\]

The difference between the two χ² values may be taken as a further criterion of whether or
not the extra parameter is needed, closely related to the test of significance of (b₁ − b₂);

\[
\chi^2_{[1]} = 1.32
\]

is not significant, though again the validity of the χ² test is in doubt.
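For readers who want the tail probability rather than a table lookup, the upper tail of χ² with one degree of freedom can be obtained from the complementary error function. The sketch below uses only the two χ² values quoted in the text; the function name is an assumption, and the doubts expressed above about the validity of the χ² approximation apply equally to the probability it returns.

```python
from math import erfc, sqrt


def chi_square_upper_tail_1df(x):
    """P(chi-square with 1 d.f. > x), via P(|Z| > sqrt(x)) for normal Z."""
    return erfc(sqrt(x / 2.0))


chi_three_parameter = 30.08   # equation (11)
chi_two_parameter = 28.76     # value quoted for equation (15)
difference = chi_three_parameter - chi_two_parameter
tail_probability = chi_square_upper_tail_1df(difference)
```

The difference of 1.32 corresponds to a tail probability of roughly one quarter, well above the 5 % level (3.841 on one degree of freedom).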

Substitution of the probit of a specified probability in equation (15) gives the value of the
constant in equation (1). For the 50 % response probability, for example, the constant is
1.82; over the range of values tested, the curves

VR^0.89 = 1.71 and VR = 1.82

differ only slightly. Similarly, fiducial limits to (x₁ + x₂) may be calculated, for any Y₀, as
upper and lower values of the product VR. No special interest attaches to these calculations;
the novelties due to the individual records are exactly as for the three-parameter equation
discussed in earlier sections, and otherwise the method is entirely that of ordinary probit
analysis (Finney, 1947, Chapter 4). For comparison with the three-parameter equation,
diagrams similar to Figs. 2 and 3 may be prepared; both the constant probability curves and






the fiducial limits are then true hyperbolae. Fig. 4 shows the results for a 50 % response 
probability, and is to be compared with Fig. 2. The constant probability curve in Fig. 4 
differs little from that in Fig. 2, though naturally the difference increases for large values of 
V or R where the curves are less well determined. For moderate values of V and R, the fiducial 








[Fig. 4. Contour of dose-response surface for 0.5 frequency of response, estimated from two-parameter
equation, and its 5 % fiducial limits (compare Fig. 2). Horizontal axis: volume of inspiration (litres).
○ no vaso-constriction; ● vaso-constriction.]
limit curves are practically the same as the corresponding curves in Fig. 2, but for more
extreme values they lie much closer to the curve of constant probability; since the data
show no significant difference between b₁ and b₂, it is to be expected that a more precisely
estimated relationship between stimulus and response will be obtained if an assumption that
β₁ = β₂ is made, so that the information on the two regression coefficients can be combined,
and this shows itself by narrowing the zone of error for the constant probability curve.
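The constants 1.71 and 1.82 quoted in this section for the 50 % contours can be recovered numerically. The sketch below rests on two assumptions that are not stated in this section: that the coded doses are x₁ = 1 + log₁₀ V and x₂ = 1 + log₁₀ R, and that the three-parameter intercept is −9.182 (inferred from the Fig. 2 equation with Y₀ = 5). Under those assumptions the figures quoted in the text are reproduced; the function names are illustrative.

```python
B1, B2 = 6.6843, 5.9400
INTERCEPT_3 = 5.0 - 14.182                     # assumption, see lead-in
INTERCEPT_2, SLOPE_2 = -9.475, 6.4067          # equation (15)


def three_parameter_constant(y0):
    """Constant c in V * R**(B2/B1) = c for expected probit y0."""
    # With x = 1 + log10(dose): B1*log V + B2*log R = y0 - intercept - B1 - B2.
    log_c = (y0 - INTERCEPT_3 - B1 - B2) / B1
    return 10.0 ** log_c


def two_parameter_constant(y0):
    """Constant c in V * R = c for expected probit y0, from equation (15)."""
    log_c = (y0 - INTERCEPT_2) / SLOPE_2 - 2.0
    return 10.0 ** log_c


exponent = B2 / B1
```

With y0 = 5 (the probit of 50 %), the two functions return approximately 1.71 and 1.82, and the exponent rounds to 0.89.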


8. SUMMARY 
The method of probit analysis has been developed to assist the study of the relationship
between the magnitude of a stimulus and the proportion of tests in which a particular 
quantal response to that stimulus appears. In some research problems, the stimulus cannot 
be controlled sufficiently to make possible the administration of a specified magnitude, 
though the stimulus actually received by any one subject can later be measured. It will 
then seldom happen that two subjects receive exactly the same ‘dose’, and the data for 








statistical analysis will generally consist of a series of doses with, for each, a statement of 
whether or not a single subject showed the characteristic response. 

Even for data of this type, the probit transformation can aid the estimation of the
relationship between dose and the probability of response. The calculations leading to the estimate
are more tedious than is usual in probit analysis, because of slow convergence from a
provisional equation to the final form, but follow the usual pattern. The validity of the χ² test
of goodness of fit (in reality a test for the normality of distribution of individual tolerances)
must be doubted, however, since the disturbance due to small class numbers will be
encountered in its most extreme form. Extensive grouping of results for adjacent doses will
provide a test less open to objection, though this will generally be insensitive to all but the
grossest deviations from normality; indeed, no valid sensitive test is to be expected with
individual records unless these are very numerous.

In this paper, the calculations have been illustrated on data relating to a reflex
vaso-constriction which sometimes occurs in the skin of the digits of human subjects after a single
deep breath. The relationship between the occurrence of this response and two dose factors,
the volume and the rate of inspiration, has been estimated for the combined records from
three subjects; inclusion of two dose factors complicates the analysis, since a bi-variate
regression equation must be fitted, but does not affect the underlying theory. The χ² test
has been discussed at length, though there is no indication of non-normality or of
heterogeneity of the data. The reliability with which the dependence of the probability of response
on the dose factors is estimated has also been examined, and curves bounding fiducial
regions, within which the true probability contours may confidently be asserted to lie, have
been determined. This method of representing the limits of error is applicable to other forms
of probit analysis involving two dose factors and is not restricted to individual records,
though it has not previously been described.


I am indebted to Mr R. W. Gilliatt, of the Department of Physiology, both for permission 
to make use of his data in an illustration of the statistical methods of my paper and for 


assistance in describing his experimental procedure. My thanks are due also to Miss M. Callow, 
who prepared Figs. 1-4. 


REFERENCES

Bliss, C. I. (1934a). The method of probits. Science, 79, 38-9.
Bliss, C. I. (1934b). The method of probits - a correction. Science, 79, 409-10.
Bliss, C. I. (1935a). The calculation of the dosage-mortality curve. Ann. Appl. Biol. 22, 134-67.
Bliss, C. I. (1935b). The comparison of dosage-mortality data. Ann. Appl. Biol. 22, 307-33.
Bliss, C. I. (1938). The determination of dosage-mortality curves from small numbers. Quart. J.
Pharm. 11, 192-216.
Bolton, B., Carmichael, E. A. & Stürup, G. (1936). Vaso-constriction following deep inspiration.
J. Physiol. 86, 83-94.
Finney, D. J. (1943). The statistical treatment of toxicological data relating to more than one dosage
factor. Ann. Appl. Biol. 30, 71-9.
Finney, D. J. (1947). Probit Analysis: A Statistical Treatment of the Sigmoid Response Curve.
Cambridge: University Press.
Finney, D. J. (1948). The principles of biological assay. J.R. statist. Soc. Suppl. 9, 46-91.
Fisher, R. A. (1946). Statistical Methods for Research Workers, 10th ed. Edinburgh: Oliver and Boyd.
Fisher, R. A. & Yates, F. (1947). Statistical Tables for Biological, Agricultural and Medical Research,
3rd ed. Edinburgh: Oliver and Boyd.
Gaddum, J. H. (1933). Reports on biological standards. III. Methods of biological assay depending
on a quantal response. Spec. Rep. Ser. Med. Res. Coun., Lond., no. 183.
Gilliatt, R. W. (1947). Vaso-constriction in the finger following deep inspiration. J. Physiol. (in the Press).














[ 335 ] 


A POWER FUNCTION FOR TESTS OF RANDOMNESS 
IN A SEQUENCE OF ALTERNATIVES 


By F. N. DAVID 


1. During recent years attention has been focused on what might be called the ‘group’
test for randomness in a sequence of alternatives. Thus, if E denote the happening of an
event, and Ē its negation, the number of alternations of E and Ē in a sequence supposedly
random has been chosen as a test criterion. This test has been put to different uses by
W. L. Stevens (1939), A. Wald & J. Wolfowitz (1940) and F. N. David (1947). It seems worth
while therefore to enquire what is the power of this test against a set of specifically defined
alternate hypotheses. The hypothesis to be tested will be that there is randomness within
the sequence, with the alternate hypothesis that if there is no randomness then there is
dependence of the type found in a simple Markoff chain. The same procedure will hold good
for dependence of the types found in more complex chains although in these cases the
enumeration is a little troublesome.


2. If there is a sequence of dependent events

\[
E_1, E_2, E_3, \ldots, E_n,
\]

then it is an elementary proposition of the probability calculus that

\[
P\{E_1 E_2 E_3 \ldots E_{n-1} E_n\} = P\{E_1\}\, P\{E_2 \mid E_1\}\, P\{E_3 \mid E_1 E_2\} \ldots P\{E_n \mid E_1 E_2 \ldots E_{n-1}\}.
\]

If the events are independent, then

\[
P\{E_1 E_2 E_3 \ldots E_{n-1} E_n\} = P\{E_1\}\, P\{E_2\}\, P\{E_3\} \ldots P\{E_n\}.
\]

This relation will be the basis of H₀, the hypothesis to be tested. If there is dependence as
in a simple Markoff chain, then mathematically each event will be dependent on the event
immediately preceding it, but will be independent of any of the other events. In this case
we shall have

\[
P\{E_1 E_2 E_3 \ldots E_{n-1} E_n\} = P\{E_1\}\, P\{E_2 \mid E_1\}\, P\{E_3 \mid E_2\} \ldots P\{E_n \mid E_{n-1}\}.
\]

This relation will be the basis of H₁, the hypothesis alternate to H₀.

3. For the hypothesis, H₀, let the probability that an event E will occur in a single trial
be p, and let the probability of Ē (the negation of E) be q, where p + q = 1. The probability
of obtaining any given sequence of r₁ E’s and r₂ Ē’s will be

\[
p^{r_1} q^{r_2}.
\]

The number of ways in which r₁ E’s and r₂ Ē’s may be arranged to form 2t and 2t + 1 sets of
E’s and Ē’s alternately is

\[
f_{2t} = \frac{2\,(r_1 - 1)!\,(r_2 - 1)!}{[(t-1)!]^2\,(r_1 - t)!\,(r_2 - t)!}, \qquad
f_{2t+1} = f_{2t}\,\frac{r_1 + r_2 - 2t}{2t}.
\]

Writing k = 2t or 2t + 1 as desired, the probability of obtaining a sequence of r₁ E’s and r₂ Ē’s
arranged in k sets is p^{r₁} q^{r₂} f_k, so that

\[
P\{k \mid r_1, r_2, H_0\} = \frac{p^{r_1} q^{r_2} f_k}{\sum_k p^{r_1} q^{r_2} f_k} = \frac{f_k}{\sum_k f_k}.
\]

k may take values 2, 3, ..., 2r₁ if r₁ = r₂, and values 2, 3, ..., 2r₂ + 1 if r₁ > r₂.
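The counting in this paragraph is easy to check by machine. The following Python sketch counts the arrangements of r₁ E’s and r₂ Ē’s into k alternating sets and normalizes them into the conditional distribution under H₀; the function names are illustrative, and the counts are written with binomial coefficients, which is equivalent to the factorial form.

```python
from math import comb


def groupings(groups, r):
    """Ways of splitting r like events into `groups` non-empty runs."""
    return comb(r - 1, groups - 1) if 1 <= groups <= r else 0


def arrangements(k, r1, r2):
    """Number of sequences of r1 E's and r2 not-E's forming exactly k sets."""
    t, odd = divmod(k, 2)
    if t < 1:
        return 0
    if odd:   # k = 2t + 1: one kind of event supplies the extra run
        return (groupings(t + 1, r1) * groupings(t, r2)
                + groupings(t, r1) * groupings(t + 1, r2))
    return 2 * groupings(t, r1) * groupings(t, r2)   # k = 2t


def null_distribution(r1, r2):
    """P{k | r1, r2, H0} for every attainable k."""
    total = comb(r1 + r2, r1)
    counts = {k: arrangements(k, r1, r2) for k in range(2, r1 + r2 + 1)}
    return {k: c / total for k, c in counts.items() if c > 0}
```

Because p^{r₁} q^{r₂} cancels, the conditional distribution does not depend on p, and the counts over all k add up to the binomial coefficient (r₁ + r₂)!/(r₁! r₂!).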











336 A power function for tests of randomness in a sequence of alternatives 


4. Following the orthodox procedure, in order to test the hypothesis, H₀, it is necessary
to find two numbers k₁ and k₂ such that

\[
P\{k \le k_1 \mid H_0\} \le \tfrac{1}{2}\epsilon, \qquad P\{k \ge k_2 \mid H_0\} \le \tfrac{1}{2}\epsilon,
\]

and therefore

\[
P\{k_1 < k < k_2 \mid H_0\} \ge 1 - \epsilon,
\]

where ε is a number arbitrarily at choice. If an observed number of sets, say k′, falls outside
the limits k₁ and k₂ then the hypothesis H₀ will be rejected in favour of some alternate
hypothesis, H₁. Alternately if H₀ is not true, but H₁ is, then

\[
1 - P\{k_1 < k < k_2 \mid H_1\}
\]

will be the power of the test in the sense of the word as used by Neyman & Pearson. Whether
k₁ or k₂ is chosen to judge the significance of an observed k′ will depend on which departure
from randomness it is most important not to overlook. If the alternate hypothesis is that
there is positive dependence in the chain, i.e. that E having occurred in the sth trial it is
more likely to occur in the (s + 1)st trial, then k₁ would be chosen. Such a situation was
envisaged in a proposed smooth test to supplement the χ² criterion (David, 1947). If, however,
the alternate hypothesis is that there is negative dependence, i.e. that E having occurred in
the sth trial, it is less likely to occur in the (s + 1)st trial, then k₂ would be the appropriate
criterion. If it is immaterial whether the departure from randomness is positive or negative
dependence, then both k₁ and k₂ may be used.


5. We now consider the alternate hypothesis, H₁. Write E_s for the occurrence of the
event E in the sth trial and Ē_s for its negation. Let

\[
P\{E_s\} = P, \qquad P\{\bar{E}_s\} = Q, \qquad P + Q = 1 \ \text{and}\ P \ge Q,
\]
\[
P\{E_s \mid E_{s-1}\} = p_1, \qquad P\{\bar{E}_s \mid E_{s-1}\} = q_1,
\]
\[
P\{E_s \mid \bar{E}_{s-1}\} = p_2, \qquad P\{\bar{E}_s \mid \bar{E}_{s-1}\} = q_2.
\]

Thus p₁ and q₂ are probabilities of no change and p₂ and q₁ probabilities of a change. If the
events are independent then p₁ = p₂ = P and q₁ = q₂ = Q.
6. In calculating the probability of obtaining any given sequence, what will matter will
be the number of changes from E to Ē and back again. Let f_t(r₁) be the number of ways in
which r₁ E’s can be arranged in t groups, i.e. let

\[
f_t(r_1) = \frac{(r_1 - 1)!}{(t-1)!\,(r_1 - t)!}.
\]

If there are 2t groups in a sequence of r₁ E’s and r₂ Ē’s, the number of ways of obtaining such
a sequence will be

\[
f_t(r_1)\, f_t(r_2)
\]

if the sequence starts with E or with Ē. The probability of obtaining any given sequence of
r₁ E’s and r₂ Ē’s of 2t groups will be

\[
P\, p_1^{\,r_1 - t}\, q_2^{\,r_2 - t}\, q_1^{\,t}\, p_2^{\,t-1} \quad \text{or} \quad Q\, q_2^{\,r_2 - t}\, p_1^{\,r_1 - t}\, p_2^{\,t}\, q_1^{\,t-1}.
\]

This follows from the fact that a sequence of 2t groups beginning with E will imply t changes
from E to Ē and t − 1 changes from Ē to E. The changes are reversed in number if the sequence
starts with Ē. For 2t + 1 groups the number of ways of obtaining the sequence will be

\[
f_{t+1}(r_1)\, f_t(r_2) \quad \text{or} \quad f_t(r_1)\, f_{t+1}(r_2)
\]

according as the sequence begins with E or Ē. The respective probabilities will be

\[
P\, p_1^{\,r_1 - t - 1}\, q_2^{\,r_2 - t}\, q_1^{\,t}\, p_2^{\,t} \quad \text{and} \quad Q\, q_2^{\,r_2 - t - 1}\, p_1^{\,r_1 - t}\, p_2^{\,t}\, q_1^{\,t}.
\]










F. N. DAVID 337


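The probability of obtaining a sequence of r₁ E’s and r₂ Ē’s in 2t groups will be
therefore, under hypothesis H₁,

\[
P\{2t \mid r_1, r_2, H_1\} =
\frac{\left(\dfrac{q_1 p_2}{p_1 q_2}\right)^{t} f_t(r_1)\, f_t(r_2) \left(\dfrac{P}{p_2} + \dfrac{Q}{q_1}\right)}
{\sum_t \left(\dfrac{q_1 p_2}{p_1 q_2}\right)^{t} \left[ f_t(r_1)\, f_t(r_2) \left(\dfrac{P}{p_2} + \dfrac{Q}{q_1}\right) + \dfrac{P}{p_1}\, f_{t+1}(r_1)\, f_t(r_2) + \dfrac{Q}{q_2}\, f_t(r_1)\, f_{t+1}(r_2) \right]}.
\]

The probability of obtaining r₁ E’s and r₂ Ē’s in 2t + 1 groups will be similarly

\[
P\{2t+1 \mid r_1, r_2, H_1\} =
\frac{\left(\dfrac{q_1 p_2}{p_1 q_2}\right)^{t} \left[ \dfrac{P}{p_1}\, f_{t+1}(r_1)\, f_t(r_2) + \dfrac{Q}{q_2}\, f_t(r_1)\, f_{t+1}(r_2) \right]}
{\sum_t \left(\dfrac{q_1 p_2}{p_1 q_2}\right)^{t} \left[ f_t(r_1)\, f_t(r_2) \left(\dfrac{P}{p_2} + \dfrac{Q}{q_1}\right) + \dfrac{P}{p_1}\, f_{t+1}(r_1)\, f_t(r_2) + \dfrac{Q}{q_2}\, f_t(r_1)\, f_{t+1}(r_2) \right]}.
\]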








7. So far no mention has been made of any possible connexion between p₁, q₁, p₂ and q₂.
It is obvious in all cases we shall have

\[
p_1 + q_1 = 1, \qquad p_2 + q_2 = 1,
\]

but the connexion between p₁ and p₂ is not immediate. We shall make the simplifying
assumption which is perhaps most closely related to practical problems, and shall state that where
nothing is known about the s − 1 trials preceding the sth trial, P{E_s} = P and P{Ē_s} = Q.
Under this assumption we have 


This result is reached easily by noticing that

\[
P\{E_s\} = P\{E_s E_{s-1}\} + P\{E_s \bar{E}_{s-1}\} = P\{E_{s-1}\}\, P\{E_s \mid E_{s-1}\} + P\{\bar{E}_{s-1}\}\, P\{E_s \mid \bar{E}_{s-1}\},
\]

whence P = Pp₁ + Qp₂.
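The formulae of §§6 and 7 can be put together in a short Python sketch that evaluates the distribution of the number of groups under the Markoff alternative and, from it, the chance of rejecting H₀ with a lower critical value. The rejection rule used here (reject when k ≤ k₁, with k₁ the largest value whose cumulative probability under H₀ does not exceed ε) is an assumption about how the critical value is fixed, and all names are illustrative. The sketch requires p₁ < 1 and P(1 − p₁) ≤ Q so that p₂ is a valid probability.

```python
from math import comb


def f(t, r):
    """Ways of arranging r like events in t groups: (r-1)!/((t-1)!(r-t)!)."""
    return comb(r - 1, t - 1) if 1 <= t <= r else 0


def conditional_distribution(r1, r2, P, p1):
    """P{k | r1, r2, H1} when P = P*p1 + Q*p2 ties p2 to p1."""
    Q, q1 = 1.0 - P, 1.0 - p1
    p2 = P * q1 / Q
    q2 = 1.0 - p2
    weights = {}
    for t in range(1, min(r1, r2) + 1):
        stay = p1 ** (r1 - t) * q2 ** (r2 - t)
        # 2t groups, starting with E or with its negation.
        weights[2 * t] = f(t, r1) * f(t, r2) * stay * (
            P * q1 ** t * p2 ** (t - 1) + Q * p2 ** t * q1 ** (t - 1))
        # 2t + 1 groups: the starting kind supplies t + 1 of the groups.
        odd = 0.0
        if t < r1:
            odd += (f(t + 1, r1) * f(t, r2) * P
                    * p1 ** (r1 - t - 1) * q2 ** (r2 - t) * (q1 * p2) ** t)
        if t < r2:
            odd += (f(t, r1) * f(t + 1, r2) * Q
                    * p1 ** (r1 - t) * q2 ** (r2 - t - 1) * (q1 * p2) ** t)
        if t < r1 or t < r2:
            weights[2 * t + 1] = odd
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()}


def lower_critical_value(null_dist, eps=0.05):
    """Largest k1 with P{k <= k1 | H0} <= eps, or None if there is none."""
    cumulative, k1 = 0.0, None
    for k in sorted(null_dist):
        cumulative += null_dist[k]
        if cumulative > eps:
            break
        k1 = k
    return k1


def conditional_power(r1, r2, P, p1, eps=0.05):
    """P{k <= k1 | r1, r2, H1}; putting p1 = P gives the test's actual size."""
    k1 = lower_critical_value(conditional_distribution(r1, r2, P, P), eps)
    if k1 is None:
        return 0.0
    alt = conditional_distribution(r1, r2, P, p1)
    return sum(prob for k, prob in alt.items() if k <= k1)
```

A call such as `conditional_power(10, 10, 0.5, 0.8)` then evaluates one point of a conditional power curve; setting p₁ = P reduces the alternative to independence, so the function returns the probability of wrongly rejecting H₀.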


8. The alternative hypothesis chosen to illustrate the power function formulae is that
there is positive dependence in the sequence, i.e. k₁ is found so that

\[
P\{k \le k_1 \mid H_0\} \le \epsilon \quad \text{and} \quad 1 - P\{k > k_1 \mid H_1\}
\]

is calculated, when p₁ > P. For economy of drawing, several power curves or what are really
sections of a kind of power surface, plotted to coordinates P, p₁, have been put together in
the diagrams of Fig. 1. For example the bottom left-hand diagram shows for r₁ = r₂ = 10
sections of the conditional power surface for P = 0.5, 0.6 and 0.75. When H₀ is true and
P = p₁, we have the 5 % risk of rejecting H₀ wrongly. As p₁ − P increases the chance of
detecting the fact increases, but in a way dependent on P. The other three diagrams show
similar sections of the surfaces with r₁ = r₂ = 5, with r₁ = 14, r₂ = 6 and with r₁ = 7, r₂ = 3.
In practice it will not be known what the value of P is, but the curves show reasonably well
how the power of the test varies as P and p₁ (and therefore p₂) vary. It is clear that the test
for randomness under discussion is most powerful when the numbers of alternates are equal,
i.e. when r₁ = r₂. The power declines sharply when r₁ increases at the expense of r₂.
Another point which emerges is that the test is only moderately powerful, against the given
alternate hypothesis tested, when r₁ + r₂ = 20, and it would appear therefore that if it
was desired not to overlook a possible departure from randomness in the form of positive
dependence in the chain, then the length of the sequence should consist of at least 20 units.
The question of other possible tests we shall not discuss at this stage.








[Fig. 1. Conditional power curves when the alternate hypothesis is positive dependence.
Four panels: sequence of 20, r₁ = 14, r₂ = 6; sequence of 20, r₁ = r₂ = 10; sequence of 10, r₁ = 7, r₂ = 3;
sequence of 10, r₁ = r₂ = 5. Each panel plots the power of the test (0.0 to 1.0) against a common scale
for P and p₁ (0.5 to 1.0), with the level P{k | H₀} marked.]


9. It will be noticed that P{2t or 2t + 1 | r₁, r₂, H₁}, which have been loosely termed power
function formulae, are not power functions in the sense originally defined by Neyman &
Pearson, but they appear to involve a justifiable extension of that idea. In order to
distinguish them from the usual meaning of the words power function, I shall refer to them as
conditional power functions. The theory of the conditional power function may be stated
briefly in the following way. It is assumed that all possible samples (or sequences) may be
classified according to their composition. Suppose that there are k of these mutually
exclusive classes, which are also the only possible, say C₁, C₂, ..., C_k. We have considered only the
case where k is finite but it appears likely that the method can be extended to cover the case
where k is enumerably infinite. These classes, C₁, C₂, ..., C_k, will correspond to regions forming
a partition of the sample space.

Let H₀ be the hypothesis tested and w₀ be the critical region used for the rejection of this
hypothesis. Given that a sample is in C_i (say), and that an alternate hypothesis H₁ is true,
then the probability that H₀ will be rejected is

\[
P\{E \in w_0 C_i \mid E \in C_i, H_1\} = \frac{P\{E \in w_0 C_i \mid H_1\}}{P\{E \in C_i \mid H_1\}},
\]

where w₀C_i means the region common to w₀ and to C_i and, following the Neyman-Pearson
notation, E is the sample point. Regarded as a function of H₁ this is the conditional power
function of the test associated with w₀ in the subset C_i of samples.

The Neyman-Pearson power function, which we might call here the overall power function, is

\[
P\{E \in w_0 \mid H_1\} = \sum_{i=1}^{k} P\{E \in w_0 C_i \mid H_1\} = \sum_{i=1}^{k} P\{E \in C_i \mid H_1\}\, P\{E \in w_0 C_i \mid E \in C_i, H_1\},
\]

which may be looked on as a weighted average of the conditional power functions.


10. There seems to be no reason why w₀ should not be built up of portions w₀C_i, these
portions being chosen to maximize each term of the summation, i.e. w₀C_i chosen to maximize
the conditional power function. For example, to revert to the specific case of randomness
within a sequence with which we have been dealing, the different partitions of r (= r₁ + r₂) are
the mutually exclusive and only possible classes C_i. It is conceivable, although practically
not very likely, that for each of these classes there will exist a different test which is more
powerful to detect specifically defined departures from the basic hypothesis tested than
any other test. The decision as to which is the most powerful test, against the same specifically
defined alternatives, to use for any given class will be decided by the conditional power
function. Once this has been decided the procedure for the complete test of significance may
be laid down. This will be: (i) count the number of alternatives in the sequence, i.e. find r₁
and r₂, (ii) from (i) decide the appropriate test of significance to use, (iii) apply the test. The
power of the test as laid down by (i), (ii) and (iii), in the usual meaning of the word, will be
given by the overall power function.

It is proposed to discuss these, and other applications of the conditional power function 
technique, in a further publication. I have been concerned here with trying to explain what 
I believe to be the basic ideas, and to forestall possible criticism that I am falling into error
(of the third kind) and am choosing the test falsely to suit the significance of the sample.


REFERENCES 


Stevens, W. L. (1939). Ann. Eugen., Lond., 9, 11.
Wald, A. & Wolfowitz, J. (1940). Ann. Math. Statist. 11, 147.
David, F. N. (1947). Biometrika, 34, 299.













A NUMERICAL SOLUTION OF THE PROBLEM OF MOMENTS 
By H. O. HARTLEY AND S. H. KHAMIS


1. INTRODUCTION


Given a statistical variable $x$ and its frequency distribution $f(x)$, then, under certain
continuity conditions for $f(x)$, the moments

\[ \mu_r = \int x^r f(x)\,dx \quad (r = 0, 1, 2, \ldots) \tag{1} \]

can be evaluated for any integer $r$. For certain distributions $f(x)$ the integrations in (1) can
be carried out analytically, resulting in simple formulae for the moments. In general there
is no inherent difficulty in obtaining numerical values for the moments by numerical
quadrature.
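
As a minimal sketch of the forward problem, the moments (1) can be obtained by numerical quadrature. The composite Simpson rule and the helper names `simpson`, `moment` and `beta_density` are assumptions made for this illustration; the Beta density with p = 8, q = 6 is used because it is the distribution taken as the example in § 4.

```python
from math import gamma


def simpson(func, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    if n % 2:
        raise ValueError("n must be even")
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3


def moment(density, r, a, b):
    """r-th moment about the origin of a density supported on [a, b]."""
    return simpson(lambda x: x ** r * density(x), a, b)


def beta_density(x, p=8, q=6):
    """Density x^(p-1) (1-x)^(q-1) / B(p, q) on 0 <= x <= 1."""
    norm = gamma(p + q) / (gamma(p) * gamma(q))
    return norm * x ** (p - 1) * (1 - x) ** (q - 1)


moments = [moment(beta_density, r, 0.0, 1.0) for r in range(3)]
```

For this density the first two moments are known in closed form (8/14 and 12/35), which gives a convenient check on the quadrature.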

The inverse problem is to find the distribution $f(x)$ given the moments $\mu_r$. This problem,
commonly known as ‘The Problem of Moments’, has received considerable attention by
mathematicians and is of interest in statistical distribution theory. There are numerous
statistics for which it is difficult to obtain a formula of the random sampling distribution
$f(x)$ amenable to numerical evaluation. On the other hand, in such cases it is often possible
to find simple formulae for the random sampling moments (Bartlett, 1937). Sometimes such
formulae are available for all integer $r$; more often than not, however, $\mu_r$ is only known for
a limited number of small $r$ (e.g. $r = 0, 1, \ldots, 6$). A simple method of ‘determining’ $f(x)$ from
the given moments would therefore be helpful in such cases.

Examples of variables of this kind are the numerous moment statistics or k-statistics for
which random sampling moments can be evaluated, notably by R. A. Fisher’s (1929, 1930)
combinatorial methods, whilst their exact sampling distributions are usually unknown.
As related statistics we should mention here the moment ratios $\sqrt{b_1}$ and $b_2$ used in tests
for deviation from normality (Geary, 1947; Geary & Worlledge, 1947). For these, the
low-order moments are known exactly. A similar situation arises with statistics defined as
likelihood ratios, as, for instance, with the criterion $L_1$ required for testing heterogeneity
in a set of variances. Moments for this statistic were obtained by Neyman & Pearson as early
as 1931, yet, although approximations to $f(L_1)$ have been obtained (Bartlett, 1937; Hartley,
1940; Nayer, 1936; Neyman & Pearson, 1931; Sukhatme, 1936; Welch, 1935, 1936), there
is still considerable doubt about their accuracy in certain cases, and the exact formula
obtained by Nair (1936) in the case of equal sample sizes is very complex.

These and numerous other problems of distribution point to the necessity of developing 
a numerical technique to deal with the following situation: 

(i) A random variable $x$ ranging between $a$ and $b$ (where $a$ may be $-\infty$ and $b$ may be $+\infty$)
has a distribution function $f(x)$ known to have a continuous derivative of order $n$.

(ii) The moments

\[ \mu_r = \int_a^b x^r f(x)\,dx \quad (r = 0, 1, \ldots, R) \tag{2} \]

are known numerically to any decimal accuracy desired but for a limited number of positive
integers $r$, viz. $r = 0, 1, \ldots, R$.

With the knowledge about $f(x)$ limited to the above conditions,
is it possible to obtain numerical values for the probability integral $P(x) = \int_a^x f(x)\,dx$
depending on the moments only, and is it possible to make a statement on the accuracy of
these values in terms of the derivatives of the function $f(x)$?

Problems of this kind have hitherto been treated principally in two ways: 

(a) When R = 2,3 or 4 nothing better can be expected than a ‘good fit’, which is often 
achieved by fitting the appropriate Pearson-type curve. 

(b) With $R$ in the neighbourhood of 5-8, expansions of the Gram-Charlier, Laguerre or
Jacobi type have been used, either as cumulant or as moment expansions. Such theorems
as are available for statements on the convergence and asymptotic behaviour of these
expansions usually require too many moments to be known. Often the expansions are only
asymptotic, and unless the distribution is close to the generating curve (Normal for
Gram-Charlier, $\Gamma$ for Laguerre), the results are often disappointing (see, for example, Kendall,
1945, Chapter 6).


2. OUTLINE OF PRESENT METHOD 


The method to be developed here is a direct application of finite-difference calculus and 
therefore provides both numerical answers to the problem, as well as gauges of their accuracy 
in form of remainder terms. The method is, in fact, closely linked with interpolation technique. 
When using any of the well-known interpolation formulae no mathematically rigorous 
statement on the accuracy of the interpolates can be made unless the magnitude of the 
remainder term can be estimated, and for this some knowledge about (say) the nth derivative 
of the function is required. Yet, in using such formulae the convergence of the difference 
table inspires confidence that ‘the results of the interpolation can be accepted as a working 
hypothesis’ (Milne Thomson, 1933, p. 62). Similarly, with the present method we shall give 
a numerical procedure of obtaining values of the probability integral. Certain checks of 
internal consistency will be described which inspire confidence that the answers are correct, 
but no rigorous statement on the accuracy can, of course, be made if this is to be based on 
a finite number of moments alone. The exact remainder terms which we derive will entail
the high-order derivatives of f(x), and it is hoped, in a second communication, to derive 
some general statements concerning their order of magnitude. 

In order to simplify the argument we assume in this section that the range of $x$ is finite
($a$ and $b$ finite).


The aim is to determine the probability integral of $x$, $P(x) = \int_a^x f(x)\,dx$, in tabular form,
i.e. we wish to determine numerical values of

\[ P_i = P(x_i) = \int_a^{x_i} f(x)\,dx \tag{3} \]

for discrete values of $x_i$. For convenience the group intervals $x_{i+1} - x_i$ will generally be chosen
equidistant (group interval $= h$), and the number of intervals will be $R + 1$, i.e. equal to the
number of given moments (including $\mu_0 = 1$). Hence

\[ x_i = a + ih, \quad h = (b - a)/(R + 1). \tag{4} \]

The first differences in the table derived from equation (3) are the quantities

\[ f_i = P_i - P_{i-1} = \int_{x_{i-1}}^{x_i} f(x)\,dx, \tag{5} \]

and are the familiar ‘frequencies’ $f_i$ in a grouped frequency distribution with equidistant
intervals (see Fig. 1). The link between these frequencies and the exact moments $\mu_r$ is then
established by the well-known formulae for Sheppard’s correction. Using Kendall’s (1938,
1945) derivation and remainder term, but extending his notation, we have

\[ \sum_{i=1}^{R+1} f_i \xi_i^r = \mu_r + C(r, h) + S(r, h), \tag{6} \]

where the centre points $\xi_i$ are given by

\[ \xi_i = a + (i - \tfrac{1}{2})h \quad (i = 1, \ldots, R + 1), \tag{7} \]

$C(r, h)$ denotes Sheppard’s corrective term, viz.

\[ C(r, h) = \sum_{j=1}^{[\frac{1}{2}r]} \binom{r}{2j} \left(\frac{h}{2}\right)^{2j} \frac{1}{2j + 1}\, \mu_{r-2j}, \tag{8} \]

and $S(r, h)$ the remainder term.
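
Sheppard's corrective term is straightforward to evaluate exactly. The sketch below assumes the form of (8) with binomial coefficient, factor $(h/2)^{2j}$ and divisor $2j + 1$; the function name `sheppard_term` and the test moments (those of a unit normal, used only as input data) are choices made for this illustration.

```python
from fractions import Fraction
from math import comb


def sheppard_term(r, h, mu):
    """C(r, h): sum over j >= 1 of binom(r, 2j) (h/2)^(2j) mu[r - 2j] / (2j + 1).

    mu[s] is the s-th moment about the origin; exact rational arithmetic is used.
    """
    h = Fraction(h)
    return sum(
        comb(r, 2 * j) * (h / 2) ** (2 * j) / (2 * j + 1) * mu[r - 2 * j]
        for j in range(1, r // 2 + 1)
    )


# Illustrative input: moments mu_0..mu_4 of a unit normal variate.
mu = [Fraction(1), Fraction(0), Fraction(1), Fraction(0), Fraction(3)]
corrections = [sheppard_term(r, 1, mu) for r in range(5)]
```

With $h = 1$ the second-order term reduces to the familiar $h^2/12$, and the fourth-order term to $\mu_2/2 + 1/80$.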








[Fig. 1. The frequencies $f_i$ of the grouped frequency distribution over the range $a$ to $b$.]

The aim, now, is to use equations (6) to determine the unknown $f_i$ from the given $\mu_r$.
To this end the remainder $S(r, h)$ must be examined. Most distribution functions have what
is commonly known as high contact at the terminals of the variate range. This means
that $f(x)$, as well as all its derivatives up to order, say, $m$, vanish at both ends of the range, i.e.

\[ f^{(s)}(a) = f^{(s)}(b) = 0 \quad (s = 0, \ldots, m). \tag{9} \]

If for such functions we define $f(x) = 0$ outside the range $a < x < b$, it will have continuous
derivatives of up to order $m$ for $-\infty < x < +\infty$. It can then be shown that the remainder
term is of the form (see, for example, Kendall, 1945, p. 69)





S,(r,h) = -FEO" pier, h,0,) (mm oven), (10) 
S(t) = AXA” por (4) (7, 6,) _(m odd), (1) 
a<6,<b, 


where the $B_m$ are the Bernoulli numbers, the $B_m^{(1)}$ are the Bernoulli polynomials of first order,
the integrand function $k(r, h, x)$ is defined by

\[ k(r, h, x) = x^r \int_{-\frac{1}{2}h}^{+\frac{1}{2}h} f(x + t)\,dt \tag{12} \]

and its derivatives with regard to $x$ are denoted by $k^{(m)}$. In the subsequent sections we shall
assume (9) to hold (contact of order $m$), but will discuss the case when (9) is not satisfied
in § 10.


The remainder term $S_m(r, h)$ will usually be small (see, for example, Kendall, 1945, p. 72).
We shall therefore, in what follows, ignore $S_m(r, h)$ but will discuss the error thereby
committed in § 5.

If, then, in (6) we omit $S(r, h)$ we obtain a system of $R + 1$ linear equations for the $R + 1$
unknowns $f_i$:

\[ \sum_{i=1}^{R+1} f_i \xi_i^r = \mu_r + C(r, h) = \bar{\mu}_r. \tag{13} \]













The matrix of this system of equations ($v_R$ say) is of the form $|\xi_i^r|$ and has a classical
determinant $\|\xi_i^r\|$, sometimes referred to as Vandermonde’s determinant and well known to be
$\neq 0$. The system can therefore be inverted once and for all and, for any particular case, the
unknown $f_i$ can then be determined by substituting the right-hand sides of (13), i.e. $\bar{\mu}_r$, in
the inverse matrix $v_R^{-1}$. Denoting the elements of this inverse matrix by $u_{ir}$, we have the
system of equations

\[ f_i = \sum_{r=0}^{R} u_{ir} \bar{\mu}_r. \tag{14} \]

Progressive addition of the $f_i$ yields the $P_j$ from $P_j = \sum_{i=1}^{j} f_i$* and therefore a table of $P(x)$ at
interval $h$. Finally, intermediate values of $P(x)$ can be obtained by standard interpolation.
Alternatively, as described in § 7, we may obtain directly a table of $P(x)$ at interval $\tfrac{1}{2}h$.


3. THE STANDARD FORM OF THE NUMERICAL INVERSION 


The rank of the original matrix $v_R$ is obviously equal to $R + 1$, i.e. the number of moments
given, whilst its elements are the powers of the centre points $\xi_i$. It is desirable therefore that,
for any given $R$, scale and location of the variable $x$ be transformed into a standard form $X$,
so that only one matrix $V_R$ and therefore only one matrix $V_R^{-1}$ need be calculated for each $R$.
It is most convenient to standardize as follows:

\[ X = \left(x - \tfrac{1}{2}(a + b)\right)\frac{R + 1}{b - a} \quad (R \text{ even}), \tag{15} \]
\[ X = \left(x - \tfrac{1}{2}(a + b)\right)\frac{R + 1}{b - a} + \tfrac{1}{2} \quad (R \text{ odd}). \tag{16} \]

It will be seen, therefore, that the range of $X$ is $R + 1$ and the group interval

\[ H = X_{i+1} - X_i = 1. \]

From the given moments of $x$ those of $X$ ($M_r$ say) about $X = 0$ can, of course, be calculated
by the usual binomial formulae, and in what follows we assume that values of $M_r$ are given
numerically. Further, in analogy to (13), we have

\[ \bar{M}_r = M_r + C(r, 1). \tag{17} \]

From (15) and (16) we obtain for the new centre points

\[ \Xi_i = \begin{cases} -\tfrac{1}{2}R, \ldots, 0, \ldots, +\tfrac{1}{2}R & \text{for even } R, \\ -\tfrac{1}{2}(R - 1), \ldots, 0, \ldots, +\tfrac{1}{2}(R + 1) & \text{for odd } R, \end{cases} \tag{18} \]

and the matrix $V_R$ becomes $|(i - 1 - \tfrac{1}{2}R)^r|$ or $|(i - \tfrac{1}{2}R - \tfrac{1}{2})^r|$. Thus, if the first six moments
are given, we obtain for $V_6$:

\[ V_6 = \begin{pmatrix}
1 & 1 & 1 & 1 & 1 & 1 & 1 \\
-3 & -2 & -1 & 0 & 1 & 2 & 3 \\
9 & 4 & 1 & 0 & 1 & 4 & 9 \\
-27 & -8 & -1 & 0 & 1 & 8 & 27 \\
81 & 16 & 1 & 0 & 1 & 16 & 81 \\
-243 & -32 & -1 & 0 & 1 & 32 & 243 \\
729 & 64 & 1 & 0 & 1 & 64 & 729
\end{pmatrix} \tag{19} \]


* It is, of course, possible to construct a matrix yielding the $P_i$ directly from the $\mu_r$, but we are here
satisfied with determining the $f_i$ first, as they are of independent interest.




In practice the important range of $R$ will be from 5 to about 8. The inverse matrix $V_6^{-1}$ is
given below, and it is hoped to give $V_5^{-1}$, $V_7^{-1}$ and $V_8^{-1}$ in a subsequent paper. The inverse
matrix $V_6^{-1}$, the elements of which are denoted by $U_{ir}$, can be written in the form

\[ c_i f_i = \sum_{r=0}^{R} U'_{ir} \bar{M}_r, \tag{20} \]

where $U'_{ir} = c_i U_{ir}$, i.e. the $c_i$ are suitable common denominators of the $U_{i0}, \ldots, U_{iR}$, and the
$U'_{ir}$ are given in the body of the schedule below (the multiplier of column $r$ is $\bar{M}_r$, with $\bar{M}_0 = 1$):

  i   c_i f_i    r = 0     1     2     3     4     5     6
  1   720 f_1       0    -12     4    15    -5    -3     1
  2   120 f_2       0     18    -9   -20    10     2    -1
  3    48 f_3       0    -36    36    13   -13    -1     1
  4    36 f_4      36      0   -49     0    14     0    -1
  5    48 f_5       0     36    36   -13   -13     1     1
  6   120 f_6       0    -18    -9    20    10    -2    -1
  7   720 f_7       0     12     4   -15    -5     3     1
                                                              (21)

In order to use the above system of equations it would be necessary to compute the $\bar{M}_r$ from
the given $M_r$, using formula (17). It is obviously more convenient to evaluate, once and for
all, a matrix $U_{ir}$ giving the $f_i$ directly in terms of the given $M_r$. This matrix is given below
for $R = 6$:

  i  f_i     M_0 = 1       M_1         M_2         M_3         M_4         M_5         M_6
  1  f_1    0.000379   -0.011719    0.002344    0.017361   -0.005208   -0.004167    0.001389
  2  f_2   -0.005227    0.109375   -0.034896   -0.152778    0.072917    0.016667   -0.008333
  3  f_3    0.059161   -0.683594    0.618490    0.253472   -0.244792   -0.020833    0.020833
  4  f_4    0.891373    0          -1.171875    0           0.354167    0          -0.027778
  5  f_5    0.059161    0.683594    0.618490   -0.253472   -0.244792    0.020833    0.020833
  6  f_6   -0.005227   -0.109375   -0.034896    0.152778    0.072917   -0.016667   -0.008333
  7  f_7    0.000379    0.011719    0.002344   -0.017361   -0.005208    0.004167    0.001389
                                                                                          (22)

Working rule: Each $f_i$ is obtained by forming the sum of seven products using the seven coefficients in
the $i$th line and applying them to $M_0, \ldots, M_6$, e.g. $f_1 = 0.000379 M_0 - 0.011719 M_1 + \ldots + 0.001389 M_6$.
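
The schedule (22) can be recomputed exactly, which is a useful guard against transcription slips. The sketch below rests on two assumptions: that the inverse of the Vandermonde matrix is given row by row by the coefficients of the Lagrange basis polynomials on the centre points, and that Sheppard's term has the form of (8) with $h = 1$. The function names are illustrative only.

```python
from fractions import Fraction
from math import comb


def lagrange_coefficients(nodes):
    """Row i: ascending power coefficients of the polynomial equal to 1 at
    nodes[i] and 0 at the other nodes (rows of the inverse Vandermonde matrix)."""
    rows = []
    for i, xi in enumerate(nodes):
        poly, denom = [Fraction(1)], Fraction(1)
        for j, xj in enumerate(nodes):
            if j == i:
                continue
            denom *= xi - xj
            # Multiply the current polynomial by (x - xj).
            poly = [(poly[k - 1] if k else 0) - xj * (poly[k] if k < len(poly) else 0)
                    for k in range(len(poly) + 1)]
        rows.append([c / denom for c in poly])
    return rows


def direct_matrix(R):
    """Coefficients giving f_i directly from the uncorrected M_0..M_R
    (unit group interval, R even, centre points -R/2, ..., +R/2)."""
    nodes = [Fraction(i) - Fraction(R, 2) for i in range(R + 1)]
    U = lagrange_coefficients(nodes)

    def sheppard_weight(r, s):
        # Coefficient of M_s in the corrected moment of order r.
        d = r - s
        return comb(r, d) * Fraction(1, 2) ** d / (d + 1) if d % 2 == 0 else 0

    return [[sum(U[i][r] * sheppard_weight(r, s) for r in range(s, R + 1))
             for s in range(R + 1)] for i in range(R + 1)]


matrix_22 = direct_matrix(6)
```

Printing `float(matrix_22[i][s])` to six decimals gives entries that can be compared with the schedule; for instance the $M_2$ coefficient of $f_4$ comes out as exactly $-75/64 = -1.171875$.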


4. CALCULATION OF THE INCOMPLETE B-FUNCTION $I_x(8, 6)$
FROM ITS FIRST SIX MOMENTS


As an example for the above method we consider the Beta Distribution for $p = 8$ and
$q = 6$, viz. $f(x) = [B(8, 6)]^{-1} x^7 (1 - x)^5$.

Using the moments for this distribution about $x = 0$, $\mu_r = B(8 + r, 6)/B(8, 6)$ $(r = 0, \ldots, 6)$,
and transforming to the standard scale $X = 7x - 3.5$, we obtain for the moments of $X$ about
$X = 0$: $M_1 = 0.5$, $M_2 = 1.05$, $M_3 = 1.225$, $M_4 = 2.77426$, $M_5 = 4.41360$ and $M_6 = 10.56942$.

Substituting these in the matrix (22) we obtain values of $f_i$ whose progressive sums are
shown in Table 1 (calculated $I_x(8, 6)$). These may be compared with the ‘exact’ values obtained
(by interpolation) from the Tables of the Incomplete B-function (1934). The worst discrepancy
is about 2 in the fourth decimal. Higher accuracy can, of course, be obtained if the number
of moments ($R + 1$) and therefore the number of $f_i$ increases (see, for example, § 8, where
the normal curve is obtained to 5-decimal accuracy).
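
The whole calculation of this section can be retraced in a few lines. This is a sketch under stated assumptions rather than the authors' own computing scheme: instead of multiplying by the tabulated inverse matrix it solves the linear system (13) directly in exact rational arithmetic, using Sheppard's term in the form (8) with unit interval. The helper `solve` is a plain Gauss-Jordan routine written for the illustration.

```python
from fractions import Fraction
from math import comb


def solve(A, b):
    """Gauss-Jordan elimination in exact rational arithmetic."""
    n = len(A)
    M = [list(map(Fraction, row)) + [Fraction(v)] for row, v in zip(A, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [v / M[col][col] for v in M[col]]
        for r in range(n):
            if r != col:
                factor = M[r][col]
                M[r] = [vr - factor * vc for vr, vc in zip(M[r], M[col])]
    return [row[-1] for row in M]


R = 6
# Moments of the Beta(8, 6) distribution about x = 0: B(8 + r, 6) / B(8, 6).
mu = [Fraction(1)]
for r in range(R):
    mu.append(mu[-1] * Fraction(8 + r, 14 + r))
# Moments M_r of X = 7x - 3.5 about X = 0 by the binomial formula.
M = [sum(comb(r, j) * Fraction(7) ** j * mu[j] * Fraction(-7, 2) ** (r - j)
         for j in range(r + 1)) for r in range(R + 1)]
# Sheppard-corrected moments with unit group interval.
M_bar = [sum(comb(r, 2 * j) * Fraction(1, 4) ** j / (2 * j + 1) * M[r - 2 * j]
             for j in range(r // 2 + 1)) for r in range(R + 1)]
centres = [Fraction(i - 3) for i in range(R + 1)]      # -3, ..., +3
V = [[c ** r for c in centres] for r in range(R + 1)]  # row r: powers of order r
f = solve(V, M_bar)
P = [float(sum(f[: i + 1])) for i in range(R + 1)]     # calculated I_x at x = 1/7, ..., 7/7
```

The progressive sums `P` should agree with the ‘calculated’ column of Table 1 to within the rounding of the tabulated coefficients.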
















A rather gratifying feature of the comparison is the higher decimal accuracy in the tails 
of the distribution. This is a consequence of the sensitivity of the higher moments to changes 
in the tail frequencies. Note also that the elements in the top and bottom lines of the inverse 
matrix (22) are much smaller than those in the other lines, so that any error in the right-hand 
sides of (13) has a smaller effect on the terminal $f_i$.


Table 1. Comparison of ‘calculated’ and ‘exact’ values of $I_x(8, 6)$

    X       x     Exact I_x   Calculated I_x   Difference × 10^-5
  -2.5     1/7     0.00011       0.00009               2
  -1.5     2/7     0.01341       0.01354             -13
  -0.5     3/7     0.14017       0.13995              22
   0.5     4/7     0.48963       0.48981             -18
   1.5     5/7     0.86261       0.86270              -9
   2.5     6/7     0.99414       0.99395              16
   3.5     7/7     1.00000       1.00000               0























It might be argued that a further error will arise when determining intermediate values
of $I_x$ by interpolation in the ‘calculated’ table. This difficulty could, however, be overcome
by shifting the grid of group intervals and using a standard $X$-scale with group end-points
corresponding to the odd multiples of 1/14 in $x$, thereby obtaining $I_x$ at points half-way
between the arguments of Table 1. Such a method has actually been used in § 7.


5. THE REMAINDER TERM


A formal representation of the remainder term is immediately obtained by reverting to the
exact equations (6). If we are concerned with distribution functions having contact of order
$m$ at the terminals, the error contributions to the $f_i$ are obtained by substituting the $R + 1$
remainder terms $S_m(r, h)$ ((10), (11)) in the inverse matrix $v_R^{-1}$. It is convenient to use the
standard variate $X$-scale, $H = 1$ and the $V^{-1}$ matrix, when it will be found that

\[ \text{error } f_i = \sum_{r=0}^{R} U_{ir} S_m(r, 1), \tag{23} \]

where $S_m(r, 1)$ is given by (10) or (11) putting $h = 1$ and remembering that the integrand
function $k$ must be taken in terms of the standard variate $X$, viz.


kes 1,X) = °[" ¢(5—5 (X40) a5 a. (24) 


Since the arguments $\theta_r$ of $k^{(m)}(r, 1, X)$ are unknown it will as a rule be necessary to substitute
their respective maxima in (23), at the same time taking $|U_{ir}|$ in place of $U_{ir}$.

Although with (23) we have given a formal solution of the error term involved, in a manner
similar to the remainder terms of interpolation formulae, it will in practice be difficult to
estimate the magnitude of the error from this expression. It is hoped, therefore, to go into this
aspect more fully in a second paper.


6. INFINITE VARIATE RANGE AND ARTIFICIAL TRUNCATION 


When the range of the variate is infinite, i.e. when $a = -\infty$ and/or $b = +\infty$, it is, of course,
possible to transform the variate $x$ by, say, $y = y(x)$ such that the range of $y$ is finite. However,
in general, we shall not be able to assume that the moments of $y$ are known or that they can
be derived from those of $x$. It is therefore necessary to adapt our method to deal with an
infinite variate range. We shall treat here the case $b = +\infty$, the case $a = -\infty$ being identical
and the case $a = -\infty$ and $b = +\infty$ being analogous.

For an infinite variate range, the condition of high contact is now replaced by

\[ \lim_{x \to \infty} f^{(s)}(x) = 0 \quad (s = 0, 1, \ldots, m), \tag{25} \]

which results in remainder terms analogous to (10) and (11)*. Similarly, in equations (6),
which correspond to Kendall’s (1945) equations (3.40), the summation now extends from
$i = 1$ to $i = \infty$, there being an infinity of frequencies $f_i = \int_{x_{i-1}}^{x_i} f(x)\,dx$. Now since the $\mu_r$
exist we know that

\[ \int_a^{\infty} x^r f(x)\,dx \tag{26} \]

is convergent. Accordingly

\[ \lim_{b \to \infty} \sum_{i=R+2}^{\infty} \xi_i^r \int_{x_{i-1}}^{x_i} f(x)\,dx = 0, \tag{27} \]

if $h = (b - a)/(R + 1)$. If, therefore, we denote the above sums by $\epsilon(r, b)$ respectively we have,
from (6),

\[ \sum_{i=1}^{R+1} f_i \xi_i^r + \epsilon(r, b) = \mu_r + C(r, h) + S(r, h). \tag{28} \]

Applying now the previous method we introduce an additional error in the calculation of $f_i$,
but this error is smaller in absolute value than $\max_r |\epsilon(r, b)| \sum_r |u_{ir}|$.


The precise determination of the $\epsilon(r, b)$ for any given $b$ would, of course, require a knowledge
of the nature of the convergence in (26), i.e. some external knowledge about the distribution
$f(x)$ which we are seeking to determine numerically. Unfortunately, such knowledge will
in general not be available.

However, if $b$ is chosen sufficiently large, the $f_i$ determined for different values of $b$ should
all yield, by the method of §§ 2 and 3, approximations to the same probability integral $P(x)$
to within the errors of the respective remainder terms $S(r, h)$ and to within the errors
introduced by (27). In practice, therefore, one would make an intelligent guess at the likely
range of $b$ and then test for internal consistency by comparing the probability integral tables
obtained by varying $b$ over this range. This method, which is illustrated in § 7, gives an idea
of the accuracy to which the integral has been determined, but no rigorous statement on
accuracy can be made without appealing to some a priori knowledge about $f(x)$. It is hoped
to deal with this aspect more fully in the next paper.


7. THE CALCULATION OF THE $\chi$-DISTRIBUTION FOR 10 DEGREES OF FREEDOM

As an illustration of the preceding section, we will now calculate the $\chi$-distribution for
10 degrees of freedom. This distribution has high contact at either terminal and, although it
is known to start at $x = \chi = 0$, we shall treat it as a distribution of double infinite range, i.e.
we shall not make direct use of the information that $f(x) = 0$ for $x < 0$, and choose a truncated
range $a < x < b$.
We have a mean of $\mu_1 = 3.08432776$, and the moments about the mean are given by†
$\mu'_2 = 0.4869223$, $\mu'_3 = 0.0806720$, $\mu'_4 = 0.7132999$, $\mu'_5 = 0.3866784$, $\mu'_6 = 1.8104865$.


* A formula for $S(r, h)$ when the range is infinite will be given in the second paper.
† These follow from the formulae for the moments about the origin, which are ratios of $\Gamma$-functions
(see, for example, Kendall, 1945, p. 55). Note that we have used $\mu$ and $\mu'$ for moments about the
origin and the mean, respectively.










The standard deviation is $\sqrt{\mu'_2} = 0.7$, and with seven group intervals available to cover
the essential range we should choose $h$ of the order of the standard deviation.* Our first
attempt is, therefore, (a) $h = 0.8$.

(a) If we make the mean of $x$ the centre point of the innermost interval we have for the
truncated range $a = \mu_1 - 3.5 \times 0.8 = \mu_1 - 2.8$ and $b = \mu_1 + 2.8$. For the standard variate $X$,
the origin $X = 0$ will coincide with the mean of $x$ and its range will be $-3.5 < X < +3.5$.
Calculation of the moments ($M_r$) of $X$ and substitution in the matrix (22) yields the following
answers for the frequencies $f_i$:

$f_1 = 0.0005$, $f_2 = 0.03325$, $f_3 = 0.26266$, $f_4 = 0.42471$,
$f_5 = 0.23196$, $f_6 = 0.04206$, $f_7 = 0.00487$.

The calculated frequency ($f_7$) for the interval $\mu_1 + 2.0 < x < \mu_1 + 2.8$ is about 0.005, and its
contribution to $\mu'_6$ about $0.005\,(2.4)^6 \sim 1$. Since this is an appreciable proportion of $\mu'_6$ it is
unlikely that the frequencies beyond $b = \mu_1 + 2.8$ when substituted in (27) can be neglected,
i.e. $b$ and $h$ are too small.

(b) Choosing therefore a larger $h$, we try $h = 1$. If we still keep the mean in the centre of
the truncated range we have $a = \mu_1 - 3.5 = -0.42$ and $b = \mu_1 + 3.5 = 6.58$ (we know, of
course, that $f(x) = 0$ for $x < 0$ so that our $f_1$ will really be the frequency for the interval
$0 < x < 0.58$). This time the standard variate is $X = x - \mu_1$ so that $M_r = \mu'_r$ and the above
values can be substituted directly in the matrix (22), yielding the comparison of calculated
$\chi$-integral and ‘exact’ $\chi$-integral as shown in Table 2.


Table 2. Comparison of calculated and exact values of the $\chi$-integral

  X = x - μ_1    P(x) exact    P(x) calculated    Difference × 10^-5
     -2.5         0.00006         0.00011                -5
     -1.5         0.00929         0.00893                36
     -0.5         0.24466         0.24475                -9
     +0.5         0.76767         0.76785               -18
     +1.5         0.97902         0.97888                14
     +2.5         0.99945         0.99947                -2
     +3.5         1.00000         1.00000                 0




















The maximum error is about 0.0004 and, again, the terminal $f_i$ have a higher decimal
accuracy. In practice, of course, the exact distribution would not be available for
comparison. This time the terminal value $f_7$ is about 0.0005 and represents the frequency for
the interval $\mu_1 + 2.5 < x < \mu_1 + 3.5$. Its contribution to $\mu'_6$ is about 0.4, thereby confirming
that the previous grid of group intervals was too fine. To obtain further confirmation on
the tail of the distribution, we determine a third set of $f_i$ by shifting the grid of group intervals
by 0.5 to the right, retaining the interval $h = 1$. This will make $a = \mu_1 - 3$ and $b = \mu_1 + 4$,
i.e. $0.08 < x < 7.08$. For our standard variate $X$ the origin will now coincide with $\mu_1 + 0.5$.


* An unsuitable choice of $h$ would, later, fail to satisfy the checks of internal consistency.
† Comparison with the exact $\chi$-distribution shows that the maximum error in the above $f_i$ is
nevertheless not more than 0.005.











The values of the $M_r$ are as follows:

$M_1 = -0.5$, $M_2 = 0.7369223$, $M_3 = -0.7747114$,
$M_4 = +1.3448394$, $M_5 = -1.834794$, $M_6 = 3.5957606$.

Substituting these in the matrix (22) we obtain the following values of $f_i$:

$f_1 = 0.00015$, $f_2 = 0.07012$, $f_3 = 0.44430$, $f_4 = 0.40421$,
$f_5 = 0.07699$, $f_6 = 0.00419$, $f_7 = 0.00004$.

The comparison of the progressive sums of the above $f_i$ with the exact $\chi$-integral is of similar
accuracy to that in Table 2. The terminal frequency for $\mu_1 + 3 < x < \mu_1 + 4$ is 0.00005 with
a contribution of about 0.03 to $\mu'_6$, indicating that we have now reached a satisfactory
choice of $b$.
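
The moments about the shifted origin follow from the moments about the mean by the usual binomial formula. The following sketch reproduces that step for the grid shifted by 0.5; the function name `moments_about_new_origin` is hypothetical, and since $h = 1$ no rescaling is needed.

```python
from math import comb


def moments_about_new_origin(central, shift):
    """Moments of X = (x - mean) - shift, given moments about the mean.

    central[r] is the r-th moment about the mean (central[0] = 1, central[1] = 0).
    """
    return [sum(comb(r, j) * central[j] * (-shift) ** (r - j) for j in range(r + 1))
            for r in range(len(central))]


# Moments about the mean quoted in this section for chi with 10 degrees of freedom.
central = [1.0, 0.0, 0.4869223, 0.0806720, 0.7132999, 0.3866784, 1.8104865]
M = moments_about_new_origin(central, 0.5)   # origin at the mean + 0.5, h = 1
```

The resulting `M[1]` to `M[6]` can be compared with the values listed above before they are substituted in the matrix (22).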

As a final check on the internal consistency we compare the answers obtained with the 
two last choices of group intervals by merging the tables of P(x) to obtain one table at 
interval 0-5. This is set out in Table 3. The differences provide a fair check on the internal 
consistency to about 3-decimal accuracy of the two separate tables. If a more reliable check
is desired, three or even four separate tables may be computed, all at the same group interval
$h$ and merged in the above manner to form a single table at interval $\tfrac{1}{3}h$ or $\tfrac{1}{4}h$. This procedure
has the added advantage that interpolation difficulties at the wide interval of $h$ are being
avoided.


Table 3. $x = \chi$ for 10 degrees of freedom. Calculated table of $P(x)$ obtained
from two separate grids of group intervals ($h = 1$)
(first, second and third differences in units of the fourth decimal)

  x - μ_1     P(x)        Δ        Δ²       Δ³
   -2.5      0.0001
                           1
   -2.0      0.0002                86
                          87                442
   -1.5      0.0089               528
                         615                601
   -1.0      0.0704              1129
                        1744               -174
   -0.5      0.2448               955
                        2699              -1122
    0        0.5147              -167
                        2532               -855
    0.5      0.7679             -1022
                        1510                112
    1.0      0.9189              -910
                         600                480
    1.5      0.9789              -430
                         170                296
    2.0      0.9959              -134
                          36                103
    2.5      0.9995               -31
                           5
    3.0      1.0000
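
The difference columns of the merged table are easily recomputed, which is the essence of the internal consistency check. A small sketch, with the tabulated $P(x)$ entered in units of the fourth decimal and a hypothetical helper `differences`:

```python
def differences(values, order):
    """Return the first `order` difference columns of a list of tabular values."""
    columns, current = [], list(values)
    for _ in range(order):
        current = [b - a for a, b in zip(current, current[1:])]
        columns.append(current)
    return columns


# P(x) of Table 3 in units of the fourth decimal, for x - mu_1 = -2.5 (0.5) 3.0.
P = [1, 2, 89, 704, 2448, 5147, 7679, 9189, 9789, 9959, 9995, 10000]
first, second, third = differences(P, 3)
```

Irregularities in the third differences indicate the level at which the two grids disagree.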


8. THE SPECIAL CASE OF SYMMETRICAL DISTRIBUTIONS; THE NORMAL INTEGRAL


By placing the origin of the standard variate $X$ at the mean of a symmetrical distribution
we obviously have $f_1 = f_{R+1}$, $f_2 = f_R$, etc., i.e. the number of unknowns is halved. On the
other hand, the odd moments contribute the meaningless equations

\[ \sum_i f_i (\Xi_i^r + \Xi_{R+2-i}^r) = \sum_i f_i \times 0 = 0. \]







With the number of unknowns and equations halved and with even moments only retained,
it is necessary to work out a new matrix ($V_R$ say) based on even-order moments only. In
practice the important values of $R$ are $R = 4$, 6, 8 and 10, and we are giving below the inverse
matrix $V_8^{-1}$ (for $R = 8$) having rank 5 (as there are five equations corresponding to $\mu_0$, $\mu_2$,
$\mu_4$, $\mu_6$ and $\mu_8$):

  i  f_i     M_0 = 1        M_2          M_4          M_6          M_8
  1  f_1    0.0003441   -0.0017857    0.0012153   -0.0002315    0.0000124
  2  f_2   -0.0039874    0.0208333   -0.0137153    0.0023148   -0.0000868
  3  f_3    0.0224151   -0.1190476    0.0711806   -0.0081019    0.0002480
  4  f_4   -0.0884281    0.5000000   -0.1434028    0.0127315   -0.0003472
  5  f_5    0.5696563   -0.4000000    0.0847222   -0.0067130    0.0001736
                                                                        (29)

Working rule: Each $f_i$ is obtained by forming the sums of five products using the five coefficients in
the $i$th line and applying them to $M_0, \ldots, M_8$; e.g. $f_1 = 0.0003441 M_0 - \ldots + 0.0000124 M_8$.


As an example we compute the normal integral from its first five even moments, $\mu_0$ to $\mu_8$,
choosing $h = 1$ and the standard variate $X$ as normal deviate. Substituting, therefore, in the
matrix (29) $M_0 = 1$, $M_2 = 1$, $M_4 = 3$, $M_6 = 15$ and $M_8 = 105$, we obtain the five $f_i$ which in
Table 4 have been progressively added to form the ‘calculated normal integral’ to be
compared with the ‘exact’ one. The accuracy is remarkable, the maximum error being 15 in
the 6th decimal.


Table 4. Comparison of calculated normal integral with exact normal integral

  x = X    Exact P(x)    Calculated P(x)    Difference × 10^-6
   -4       0.000032        0.000034               -2
   -3       0.001350        0.001342                8
   -2       0.022750        0.022765              -15
   -1       0.158655        0.158643               12
    0       0.500000        0.500000                0
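
The symmetrical case can be retraced numerically as follows. The sketch makes explicit two assumptions that are read off from Table 4 and from the coefficient sums of (29) rather than stated in the text: the group end-points lie at the integers, so that the centre points are $\pm 0.5, \ldots, \pm 4.5$, and each even-moment equation takes the form $2 \sum f_i \Xi_i^r = \bar{M}_r$. The linear system is solved directly instead of using the tabulated inverse.

```python
from fractions import Fraction
from math import comb, erf, sqrt


def solve(A, b):
    """Gauss-Jordan elimination in exact rational arithmetic."""
    n = len(A)
    M = [list(map(Fraction, row)) + [Fraction(v)] for row, v in zip(A, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [v / M[col][col] for v in M[col]]
        for r in range(n):
            if r != col:
                factor = M[r][col]
                M[r] = [vr - factor * vc for vr, vc in zip(M[r], M[col])]
    return [row[-1] for row in M]


# Even moments M_0, M_2, ..., M_8 of the unit normal variate.
even_moments = {0: 1, 2: 1, 4: 3, 6: 15, 8: 105}
orders = sorted(even_moments)


def corrected(r):
    """Sheppard-corrected moment of even order r for unit group interval."""
    return sum(comb(r, 2 * j) * Fraction(1, 4) ** j / (2 * j + 1) * even_moments[r - 2 * j]
               for j in range(r // 2 + 1))


# Centre points of the five intervals on the negative side: -4.5, ..., -0.5.
centres = [Fraction(-9, 2) + i for i in range(5)]
# By symmetry each frequency occurs twice in every even-moment equation.
A = [[2 * c ** r for c in centres] for r in orders]
f = solve(A, [corrected(r) for r in orders])
calculated = [float(sum(f[: i + 1])) for i in range(5)]              # P(-4), ..., P(0)
exact = [0.5 * (1 + erf(x / sqrt(2))) for x in (-4, -3, -2, -1, 0)]
```

Comparing `calculated` with `exact` gives the size of the discrepancies in the sixth decimal.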




















With symmetrical distributions we cannot, of course, shift the grid of group intervals,
as otherwise we would lose the symmetry relation between the $f_i$. If, therefore, intermediate
values of $P(x)$ are required in order to ease subsequent interpolation, we can achieve this only
by altering $h$. Merging the answers obtained from (say) three different $h$ grids all centred
at $x = 0$ (e.g. $h = 0.9$, 1.0 and 1.1), we would not obtain a table of $P(x)$ at an equidistant
interval. In the internal check we would, therefore, use divided differences.


9. DIVERGENT OR POORLY CONVERGENT MOMENTS; THE
t-DISTRIBUTION FOR 10 DEGREES OF FREEDOM


Some variates with infinite range have distribution functions with low contact at $x = \infty$,
i.e. the convergence in

\[ \lim_{x \to \infty} f(x) = 0 \tag{30} \]

is slow; indeed, in some cases the moment $\mu_r$ is divergent for, say, $r > R'$.
As an example we have investigated the $t$-distribution for 10 degrees of freedom. Here we
have $f(x) = c(1 + x^2/10)^{-11/2}$ and hence $R' = 10$. In this case, therefore, $R'$ is known a priori.
If no such mathematical information is available, warning of low contact is given by the rapid
growth of the moments as $r \to R$, provided $R$ is near to $R'$.* For our example for the
$t$-distribution we find

$\mu_2 = 1.25$, $\mu_4 = 6.25$, $\mu_6 = 78.125$, $\mu_8 = 2734.375$.

The difficulty with such distributions is that artificial truncation is not justified if the
high-order, poorly convergent moments are to be used in equations (6). The remedy in such
cases is the square variate transformation $y^2 = x$. Sometimes it may be necessary to use a
higher power $y^k = x$. Obviously, if we were to take an equidistant interval for $y$, the group
interval for $x$ will grow with the square law, thereby absorbing the slowly convergent tail
end of $f(x)$.


Now, obviously, the moments of $y$ are simply related to those of $x$; we have

\[ \int_0^{\infty} x^r f(x)\,dx = 2 \int_0^{\infty} y^{2r+1} f(y^2)\,dy, \tag{31} \]

or introducing the new distribution function $g(y) = 2y f(y^2)$, we have

\[ \int_0^{\infty} x^r f(x)\,dx = \int_0^{\infty} y^{2r} g(y)\,dy. \tag{32} \]

Applying now the previous method to $g(y)$ it is further necessary to avoid using the poorly
convergent high-order moments. In the case of the $t$-distribution, instead of taking
$r = 0$, 2, 4, 6 and 8, we take the absolute moments† for $r = 0$, 1, 2, 3 and 4, which, according
to (32), correspond to the even moments of $g(y)$. If only even moments about the origin are
used in the determination of the $f_i$, the matrix (29) gives the appropriate inversion. Using
$h = 0.6$ for the $y$-group interval we substitute in (29):

$M_0 = 1$, $M_2 = 2.401906$, $M_4 = 9.645062$, $M_6 = 52.952032$ and $M_8 = 372.108863$.

We thereby obtain five values of $f_i$ $(i = 1, \ldots, 5)$ of the form

\[ f_i = \int_{(5-i)h}^{(6-i)h} g(y)\,dy = \int_{(5-i)^2 h^2}^{(6-i)^2 h^2} f(x)\,dx. \tag{33} \]

The progressive sums of these are compared with the corresponding values of the exact
$t$-integral in Table 5. Although the accuracy is lower than in the previous example it is
satisfactory and very much better than we could have obtained without applying the
transformation $y^2 = x$.
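
The moments substituted in (29) can be checked from the closed form of the absolute moments of Student's $t$. The sketch below uses the standard expression $E|t|^r = \nu^{r/2}\,\Gamma(\tfrac{1}{2}(r+1))\,\Gamma(\tfrac{1}{2}(\nu - r)) / (\sqrt{\pi}\,\Gamma(\tfrac{1}{2}\nu))$, which is not quoted in the text, and divides by $h^{2r}$ to pass to the standard variate on the $y$-scale; the function name is illustrative.

```python
from math import gamma, pi, sqrt


def abs_moment_t(r, nu):
    """E|t|**r for Student's t with nu degrees of freedom (requires r < nu)."""
    return nu ** (r / 2) * gamma((r + 1) / 2) * gamma((nu - r) / 2) / (sqrt(pi) * gamma(nu / 2))


h = 0.6   # group interval on the y-scale, where y**2 = |t|
# Even moments of the standard variate Y = y / h: M_{2r} = E|t|**r / h**(2r).
M = {2 * r: abs_moment_t(r, 10) / h ** (2 * r) for r in range(5)}
```

Only moments of $t$ up to the fourth order are needed, well below the order at which divergence sets in.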


Table 5. Comparison of calculated and exact values of the $t$-integral

    t      P(t) exact    P(t) calculated    Difference × 10^-4
   5.76     0.0001          0.0001                  0
   3.24     0.0044          0.0042                  2
   1.44     0.0902          0.0905                 -3
   0.36     0.3613          0.3648                -35
   0.00     0.5000          0.5000                  0




















* If $R$ is much smaller than $R'$, the present difficulty will not arise at all.

† We shall show in a second paper that, if the absolute moments of a distribution are not known, they
can be obtained by interpolation between the values of $\log \mu_r$ for $r = 2$, 4, 6, 8, etc.; in fact, we shall give
a general discussion of the interpolability of the logarithmic moment function for positive $r$.










10. Lack OF HIGH CONTACT AT THE START OF THE VARIATE =a 


We confine ourselves here to the most important case of lack of high contact at one terminal,
say the start of the distribution, and assume, therefore, that there is high contact at the other
end of the range.

Without loss of generality we assume that a = 0, i.e. x > 0, and introduce the new variate
$y^k = x$, k > 2. Whence we have

$$\int_0^b x^r f(x)\,dx = \int_0^{b^{1/k}} y^{rk} g(y)\,dy, \tag{34}$$

where $g(y) = k y^{k-1} f(y^k)$. Obviously g(y) has, at least, contact of order k-1 at the start y = 0;
further, if f(x) has contact of order m at x = b, i.e. if $f(x) = O(x^{-m})$ at x = b, then
$g(y) = O(y^{-(m-1)k-1})$. Hence there is high-order contact, of order k-1 and (m-1)k+1,
respectively, at both ends of the range. The previous method is therefore applicable to g(y)
provided we can obtain its moments from those of f(x). It is obvious from (34) that in order
to obtain the ordinary moments of g(y) we require to know the 'fractional' moments of f(x),
i.e. those corresponding to r = j/k (j = 0, 1, ...). If the moments of f(x) are only known for
integer r the fractional moments will have to be obtained by interpolation of the logarithmic
moment function $\log \mu_r$, which will be more fully discussed in the next paper.
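The change of variable behind (34) can be checked numerically. The sketch below is illustrative only: the density f(x) = exp(-x) on 0 < x < b, the values of b, k and r, and the midpoint-rule helper are all assumptions chosen for the check and do not come from the paper.

```python
import math

def midpoint(func, lo, hi, steps=200_000):
    """Composite midpoint rule (an illustrative helper, not from the paper)."""
    width = (hi - lo) / steps
    return width * sum(func(lo + (i + 0.5) * width) for i in range(steps))

# Hypothetical test case: f(x) = exp(-x) on 0 < x < b has no high contact at x = 0.
b, k, r = 4.0, 3, 0.5

def f(x):
    return math.exp(-x)

def g(y):
    # Transformed function g(y) = k * y^(k-1) * f(y^k)
    return k * y ** (k - 1) * f(y ** k)

# Fractional moment of f(x) of order r, and moment of g(y) of order r*k.
lhs = midpoint(lambda x: x ** r * f(x), 0.0, b)
rhs = midpoint(lambda y: y ** (r * k) * g(y), 0.0, b ** (1.0 / k))
print(lhs, rhs)
```

The two integrals agree to within the quadrature error, which is all the check is meant to show.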


REFERENCES 


BARTLETT, M. S. (1937). Proc. Roy. Soc. A, 160, 268.
FISHER, R. A. (1929). Proc. Lond. Math. Soc. (2), 30, 199.
FISHER, R. A. (1930). Proc. Roy. Soc. A, 130, 16.
GEARY, R. C. (1947). Biometrika, 34, 38.
GEARY, R. C. & WORLLEDGE, J. P. G. (1947). Biometrika, 34, 98.
HARTLEY, H. O. (1940). Biometrika, 31, 249.
KENDALL, M. G. (1938). J.R. statist. Soc. 101, 592.
KENDALL, M. G. (1945). The Advanced Theory of Statistics, 1. London: C. Griffin and Co.
MILNE-THOMSON, L. M. (1933). The Calculus of Finite Differences. London: Macmillan.
NAIR, U. S. (1936). Biometrika, 30, 274.
NAYER, P. P. N. (1936). Statist. Res. Mem. 1, 38.
NEYMAN, J. & PEARSON, E. S. (1931). Bull. int. Acad. Cracovie, A, p. 460.
SUKHATME, P. V. (1936). Statist. Res. Mem. 1, 94.
Tables of the Incomplete Beta-Function, edited by Karl Pearson (1934). London: Biometrika Office.
WELCH, B. L. (1935). Biometrika, 27, 145.
WELCH, B. L. (1936). Statist. Res. Mem. 1, 1.










APPROXIMATION TO PERCENTAGE POINTS OF 
THE z-DISTRIBUTION 


By A. H. CARTER, King’s College, Cambridge 


Tables have been published of the values of z for various percentage levels (20, 5, 1 and 0.1 %)
for a range of given $n_1$, $n_2$ (Fisher & Yates, 1943, Table V). When $n_1$ or $n_2$ is outside the range
of the tables, recourse must be had to approximate formulae (unless, of course, interpolation
is sufficiently accurate) which will combine accuracy with facility of computation. One
such formula, due to Fisher, with a modification suggested by Cochran (1940), is given at
the foot of the above-mentioned tables. The purpose of this paper is to derive an alternative
formula, no more difficult to compute, which will be shown to give consistently closer
approximations to the true value of z for all except small $n_1$ or $n_2$.

Wishart (1947) has derived formulae for the exact cumulants of z, and also the well-known
approximations to them when $n_1$, $n_2$ are large. The exact cumulants as far as $\kappa_4$ can be readily
obtained arithmetically from tables of the Polygamma functions. Knowing the cumulants
of the distribution, we may make use of the Cornish-Fisher normalization function method,
based on Edgeworth's form of the Gram-Charlier type A series (Cornish & Fisher, 1937), to
approximate to the percentage points. The method consists in writing z as an expansion
in powers of a corresponding normal variate, $\xi$, the coefficients being functions of the
cumulants of z, and assumes that $\kappa_r$ is of order $n^{1-r}$, which is true for the z-distribution
(Wishart, 1947, p. 172).

If z and $\xi$ are expressed in standard measure (i.e. mean zero, standard deviation unity)
we then derive

$$\frac{z-\mu'_1}{\sigma} \sim \xi + \frac{\kappa_3}{\sigma^3}\,\frac{\xi^2-1}{6} + \frac{\kappa_4}{\sigma^4}\,\frac{\xi^3-3\xi}{24} - \frac{\kappa_3^2}{\sigma^6}\,\frac{2\xi^3-5\xi}{36},$$

correct to order $n^{-1}$. This gives

$$\text{Formula (a):}\quad z \sim \mu'_1 + \sigma\xi + \frac{\kappa_3}{\sigma^2}\,\frac{\xi^2-1}{6} + \frac{\kappa_4}{\sigma^3}\,\frac{\xi^3-3\xi}{24} - \frac{\kappa_3^2}{\sigma^5}\,\frac{2\xi^3-5\xi}{36}, \tag{1}$$

correct to order $n^{-3/2}$ (since $\sigma = O(n^{-1/2})$), where $\mu'_1 (= \kappa_1)$, $\sigma^2 (= \kappa_2)$, $\kappa_3$, $\kappa_4$ are cumulants of
the z-distribution. The $\xi$-coefficients may be readily computed: e.g. for the 5 % level, substitute
$\xi = 1.64485$. Table 2 gives the values, for the 20, 5, 1 and 0.1 % levels, of the coefficients
required in applying the formula. The quantities $\mu'_1$, $\sigma$, $\kappa_3$, $\kappa_4$ depend of course on $n_1$ and $n_2$,
and may be evaluated in any particular case, whence substitution in (1) gives the appropriate
value of z. Since $|z_{1-P}(n_1, n_2)| = |z_P(n_2, n_1)|$, where $z_P$ is the value of z corresponding to
probability P, to find the percentage points for the 'negative tail', i.e. 80, 95, 99 and 99.9 %,
we may simply interchange $n_1$ and $n_2$. This has the effect of changing the sign of the odd
cumulants, so that in (1) we write $-\mu'_1$ and $-\kappa_3$ for $\mu'_1$ and $\kappa_3$.
Formula (a), being an approximation to order $n^{-3/2}$, may be expected to give reliable results
when $n_1$ and $n_2$ are both large. For the 1 % point, for example, we find z(6, 12) = 0.7843
(true value 0.7864), whereas z(24, 60) = 0.3744 (true value 0.3746). Some further results for
(6, 12), (6, 60), and (24, 60) are shown in Table 1 (a).

In practice, some labour is involved in applying formula (a), even if polygamma tables
are available. The Fisher-Cochran formula, derived by the normalization function method,
is a simple working approximation, valid for large $n_1$, $n_2$, in which the exact cumulants are
replaced by their approximations in terms of inverse powers of $n_1$ and $n_2$.







Table 1. Comparison of approximations to the percentage points of z

Formula (b): Existing formula (Fisher-Cochran).
Formula (c): New formula.

Percentage                 n1, n2:
level                      6, 12    6, 60    24, 60   20, 36   20, 100  36, 60   24, 24   36, 36
20        Formula (b)      0.2687   0.1901   0.1335   0.1577   0.1287   0.1212   0.1741   0.1415
          Formula (c)      0.2733   0.2020   0.1340   0.1580   0.1298   0.1213   0.1740   0.1415
          True z           0.2706   0.1965   0.1338   0.1579   0.1294   0.1213   0.1740   0.1415
5         Formula (b)      0.5507   0.3990   0.2650   0.3128   0.2573   0.2390   0.3426   0.2778
          Formula (c)      0.5501   0.4100   0.2654   0.3129   0.2586   0.2391   0.3426   0.2778
          True z           0.5487   0.4064   0.2654   0.3129   0.2583   0.2391   0.3425   0.2778
1         Formula (b)      0.7992   0.5646   0.3746   0.4441   0.3619   0.3385   0.4894   0.3955
          Formula (c)      0.7886   0.5698   0.3746   0.4435   0.3629   0.3384   0.4893   0.3955
          True z           0.7864   0.5687   0.3746   0.4435   0.3630   0.3384   0.4890   0.3954
0.1       Formula (b)      1.1074   0.7474   0.4963   0.5928   0.4755   0.4503   0.6602   0.5307
          Formula (c)      1.0693   0.7372   0.4954   0.5906   0.4756   0.4498   0.6595   0.5304
          True z           1.0628   0.7377   0.4955   0.5905   0.4760   0.4498   0.6589   0.5302

Percentage                 n1, n2:
level                      12, 6    60, 6    60, 24   36, 20   100, 20  60, 36
20        Formula (b)      0.3509   0.3346   0.1566   0.1783   0.1656   0.1314
          Formula (c)      0.3506   0.3408   0.1569   0.1785   0.1665   0.1315
          True z           0.3510   0.3388   0.1568   0.1784   0.1661   0.1315
5         Formula (b)      0.6884   0.6435   0.3047   0.3483   0.3208   0.2566
          Formula (c)      0.7001   0.6706   0.3060   0.3493   0.3236   0.2570
          True z           0.6931   0.6596   0.3055   0.3488   0.3227   0.2568
1         Formula (b)      1.0120   0.9444   0.4368   0.4995   0.4615   0.3661
          Formula (c)      1.0370   0.9956   0.4391   0.5016   0.4662   0.3667
          True z           1.0218   0.9770   0.4385   0.5009   0.4666   0.3666
0.1       Formula (b)      1.4352   1.3340   0.5930   0.6789   0.6303   0.4932
          Formula (c)      1.4681   1.4155   0.5965   0.6820   0.6375   0.4942
          True z           1.4449   1.3929   0.5962   0.6814   0.6371   0.4940





























Table 1 (a). Some values of z from formula (a) (exact cumulant formula)
(For corresponding true values, see Table 1)

%       n1, n2:
        6, 12    6, 60    24, 60   12, 6    60, 6    60, 24
20      0.2699   0.1998   0.1338   0.3499   0.3335   0.1567
5       0.5457   0.4022   0.2652   0.6958   0.6627   0.3057
1       0.7843   0.5640   0.3744   1.0295   0.9854   0.4388
0.1     1.0684   0.7433   0.4956   1.4592   1.4026   0.5966


































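The cumulant function of z is

$$K(t) = \log\left\{\left(\frac{n_2}{n_1}\right)^{\frac12 it}\frac{\Gamma\{\frac12(n_1+it)\}\,\Gamma\{\frac12(n_2-it)\}}{\Gamma(\frac12 n_1)\,\Gamma(\frac12 n_2)}\right\},$$

and the cumulants are obtained by differentiating this successively with respect to (it),
at each stage putting t = 0.

Since
$$\left[\frac{d^r}{d(it)^r}\log\Gamma\left(\frac{n \pm it}{2}\right)\right]_{t=0} = (\pm 1)^r\,\frac{d^r}{dn^r}\log\Gamma\left(\frac{n}{2}\right),$$

and $\log\Gamma(\frac12 n)$ can be expanded by Stirling's theorem in inverse powers of n, the cumulants
may also readily be expressed in inverse powers of $n_1$, $n_2$; and, when $n_1$, $n_2$ are reasonably
large, the first few terms only in the expansions will give sufficiently close approximations
to them. In fact, writing

$$s = \frac{1}{n_1} + \frac{1}{n_2}, \qquad d = \frac{1}{n_1} - \frac{1}{n_2},$$

it has been shown that

$$\begin{aligned}
\kappa_1 = \mu'_1 &= -\tfrac12 d - \tfrac16 sd + O(n^{-4}),\\
\kappa_2 = \sigma^2 &= \tfrac12 s + \tfrac14(s^2+d^2) + O(n^{-3}),\\
\kappa_3 &= -\tfrac12 sd + O(n^{-3}),\\
\kappa_4 &= \tfrac14 s(s^2+3d^2) + O(n^{-4}),\\
\kappa_r &= O(n^{1-r}) \quad (r>1).
\end{aligned} \tag{A}$$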
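As a rough numerical illustration, the leading terms of (A) can be evaluated directly. The sketch below assumes the reading s = 1/n1 + 1/n2, d = 1/n1 - 1/n2 and the coefficients -1/2, -1/6, 1/2, 1/4, -1/2, 1/4 that follow from Stirling's expansion; the function name is hypothetical.

```python
def approx_cumulants(n1, n2):
    """Leading-term approximations to the first four cumulants of z (assumed reading of (A))."""
    s = 1.0 / n1 + 1.0 / n2
    d = 1.0 / n1 - 1.0 / n2
    k1 = -0.5 * d - s * d / 6.0            # mean
    k2 = 0.5 * s + 0.25 * (s * s + d * d)  # variance
    k3 = -0.5 * s * d
    k4 = 0.25 * s * (s * s + 3.0 * d * d)
    return k1, k2, k3, k4

k1, k2, k3, k4 = approx_cumulants(24, 60)
print(k1, k2 ** 0.5, k3, k4)
```

For (24, 60) the approximate mean and standard deviation come out close to the exact values -0.0127429 and 0.173779 quoted for the same case in Aroian's note later in this issue.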


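Formula (a) will now have an extra term, since we take as our 'working variance' of z
not its exact value, but its approximation to order $n^{-1}$ from (A), i.e. $\frac12 s$. In the notation of
Kendall (1945)
$$l_2 = \kappa_2/\tfrac12 s - 1 \sim \tfrac12(s + d^2/s).$$

We then obtain

$$z - \mu'_1 \sim \sqrt{\frac{s}{2}}\,\xi - \frac{d}{6}(\xi^2-1) + \sqrt{\frac{s}{2}}\left[\frac{s}{24}(\xi^3+3\xi) + \frac{d^2}{72s}(\xi^3+11\xi)\right] \tag{2}$$

$$= \frac{\xi}{\sqrt h}\left(1 + \frac{1}{2h}\,\frac{\xi^2+3}{6}\right) - \frac{d}{6}(\xi^2-1) + \frac{1}{144}\,d^2\sqrt{\frac{2}{s}}\,(\xi^3+11\xi),$$

or
$$z - \mu'_1 \sim \frac{\xi}{\sqrt{h-\lambda}} - \frac{d}{6}(\xi^2-1), \tag{3}$$

where
$$h = \frac{2}{s}, \qquad \lambda = \frac{\xi^2+3}{6},$$

provided $\frac{1}{144}d^2\sqrt{2/s}\,(\xi^3+11\xi)$ may be neglected (which will be the case for small d).
Inserting the approximation to $\mu'_1$ from (A), i.e. $-\frac12 d$,

$$z \sim \frac{\xi}{\sqrt{h-\lambda}} - \frac{d}{6}(\xi^2+2),$$

the Fisher-Cochran formula, which has, in fact, been found to give a fairly close approximation
to the true z for $n_1$, $n_2$ both reasonably large. It may be noted that if $n_1$, $n_2$ are not very
large, an improvement will be effected by including the second term in the estimate of the
mean ($\mu'_1$), i.e. from (A) by adding $-\frac16 sd$.
For $(n_1, n_2) = (6, 12)$ this correction is -0.00347, and for (24, 60), the correction is
-0.00024. Inserting this improved approximation to $\mu'_1$ in (3) we have

$$\text{Formula (b):}\quad z \sim \frac{\xi}{\sqrt{h-\lambda}} - \frac{d}{6}(\xi^2+2+s). \tag{3b}$$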







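As pointed out by Wishart (1947, p. 179), an approximation to the value of any $\kappa_r$ (r > 1)
obtained by considering its leading term only, will be improved by writing $1/(n_1-1)$ and
$1/(n_2-1)$ in place of $1/n_1$ and $1/n_2$. For, by Stirling's expansion of a factorial,

$$\log\Gamma\left(\frac n2\right) = \frac{n-1}{2}\log\frac n2 - \frac n2 + \tfrac12\log 2\pi + \frac{1}{6n} + O(n^{-3}),$$

$$\frac{d}{dn}\log\Gamma\left(\frac n2\right) = \tfrac12\log\frac n2 - \frac{1}{2n} - \frac{1}{6n^2} + O(n^{-4}),$$

with
$$\frac{d^2}{dn^2}\log\Gamma\left(\frac n2\right) = \frac{1}{2(n-1)} + O(n^{-3}),$$

and so on.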
Table 2. ξ-coefficients required in applying formula (a)

                      20%        5%         1%         0.1%
ξ                     0.84162    1.64485    2.32635    3.09023
(1/6)(ξ² - 1)        -0.04861    0.28426    0.73532    1.42492
(1/24)(ξ³ - 3ξ)      -0.08036   -0.02018    0.23379    0.84332
(1/36)(2ξ³ - 5ξ)     -0.08377    0.01878    0.37634    1.21026
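The entries of Table 2 can be regenerated from the normal deviate alone. The following is a minimal sketch, assuming the three coefficients are (ξ² - 1)/6, (ξ³ - 3ξ)/24 and (2ξ³ - 5ξ)/36 as in formula (a); the function name is hypothetical and the deviate is taken from Python's standard library rather than from printed tables.

```python
from statistics import NormalDist

def xi_coefficients(tail_prob):
    """Return xi and the three xi-polynomial coefficients of formula (a) for an upper tail area."""
    xi = NormalDist().inv_cdf(1.0 - tail_prob)
    return (
        xi,
        (xi ** 2 - 1.0) / 6.0,
        (xi ** 3 - 3.0 * xi) / 24.0,
        (2.0 * xi ** 3 - 5.0 * xi) / 36.0,
    )

for p in (0.20, 0.05, 0.01, 0.001):
    print(p, [round(c, 5) for c in xi_coefficients(p)])
```

Small differences in the fifth decimal place against the printed table are to be expected, since the table was computed from ξ rounded to five decimals.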
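Thus writing
$$s' = \frac{1}{n_1-1} + \frac{1}{n_2-1}, \qquad d' = \frac{1}{n_1-1} - \frac{1}{n_2-1},$$

we might expect a better approximation to z to be obtained, corresponding to that of Fisher
and Cochran, if we use s' and d' instead of s and d.
Corresponding to equations (A) we have

$$\begin{aligned}
\kappa_2 &= \tfrac12 s' - \tfrac1{24}s'(s'^2+3d'^2) + O(n^{-4}),\\
\kappa_3 &= -\tfrac12 s'd' + O(n^{-4}),\\
\kappa_4 &= \tfrac14 s'(s'^2+3d'^2) + O(n^{-4}),\\
\kappa_r &= O(n^{1-r}) \quad (r>1).
\end{aligned} \tag{B}$$

For the mean, however,
$$\mu'_1 = \kappa_1 = -\tfrac12 d - \tfrac16 sd + O(n^{-4}) \tag{4}$$
$$= -\tfrac12 d' + \tfrac13 s'd' + O(n^{-3}). \tag{4a}$$

If $n^{-3}$ is not negligible (relative to the degree of accuracy desired) $\mu'_1$ should therefore be left
in the form $-\frac12 d - \frac16 sd$.
Proceeding as before, we obtain

$$z - \mu'_1 \sim \sqrt{\frac{s'}{2}}\,\xi - \frac{d'}{6}(\xi^2-1) + \sqrt{\frac{s'}{2}}\left[\frac{s'}{24}(\xi^3-3\xi) + \frac{d'^2}{72s'}(\xi^3-7\xi)\right], \tag{5}$$

whence
$$z - \mu'_1 \sim \frac{\xi}{\sqrt{h'-\lambda'}} - \frac{d'}{6}(\xi^2-1), \tag{6}$$

where $h' = 2/s'$, $\lambda' = \frac16(\xi^2-3)$.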
















Since this is based in the first place on more accurate approximations (B) to the cumulants,
and since the term omitted from (5) in deriving (6) (i.e. $\frac1{144}d'^2\sqrt{2/s'}\,(\xi^3-7\xi)$)
is evidently numerically less than the corresponding term omitted in obtaining the Fisher-Cochran
formula (i.e. $\frac1{144}d^2\sqrt{2/s}\,(\xi^3+11\xi)$), formula (6) might be expected to give an
improved approximation to z. In fact, however, it does not, and the reason is not
far to seek.

Consider the expansion of $\xi/\sqrt{h-\lambda}$ in both cases:

$$\frac{\xi}{\sqrt{h-\lambda}} \sim \frac{\xi}{\sqrt h}\left(1 + \frac{\lambda}{2h} + \frac{3\lambda^2}{8h^2} + \dots\right),$$

where the terms are decreasing in magnitude (since $1/h = \frac12 s = O(n^{-1})$). Hence the error
in neglecting all terms after the second will be approximately of the order of the third term.

Now in the Fisher-Cochran approximation this term, $+\frac{3\lambda^2}{8h^2}\frac{\xi}{\sqrt h}$, has the same sign as the
omitted term $\frac1{144}d^2\sqrt{2/s}\,(\xi^3+11\xi)$ (both being of the same sign as $\xi$), so that the extra terms
included will tend to compensate for the term omitted. In obtaining (6) from (5), on the
other hand, the term omitted, $\frac1{144}d'^2\sqrt{2/s'}\,(\xi^3-7\xi)$, will be of opposite sign to $\xi$ when
$|\xi| < \sqrt7$, corresponding to a probability of about 0.004: so that for most percentage levels
encountered in practice, the error in (5) is increased in (6).
A better formula is obtained from (5) as

$$z - \mu'_1 \sim \frac{\xi\sqrt{h'+\lambda'}}{h'} - \frac{d'}{6}(\xi^2-1), \tag{7}$$

where h' and λ' are as in (6).

Expanding $\xi\sqrt{h'+\lambda'}/h'$, it is found that the third term is now of opposite sign to $\xi$, and
hence the extra terms contained in the expansion will tend to compensate for the term
omitted. Since s' and d' require to be calculated in applying this formula, it is desirable to
write $\mu'_1$ in the form (4a) (provided we can neglect quantities of order $n^{-3}$). This gives

$$\text{Formula (c):}\quad z \sim \frac{\xi\sqrt{h'+\lambda'}}{h'} - \frac{d'}{6}(\xi^2+2-2s'). \tag{7a}$$

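Collecting the results, we have the three approximate formulae:

Formula (a) (exact cumulant method):
$$z \sim \kappa_1 + \sigma\xi + \frac{\kappa_3}{\sigma^2}\,\frac{\xi^2-1}{6} + \frac{\kappa_4}{\sigma^3}\,\frac{\xi^3-3\xi}{24} - \frac{\kappa_3^2}{\sigma^5}\,\frac{2\xi^3-5\xi}{36}.$$

Formula (b) (Fisher-Cochran formula):
$$z \sim \frac{\xi}{\sqrt{h-\lambda}} - \frac{d}{6}(\xi^2+2+s),$$
where
$$s = \frac{1}{n_1}+\frac{1}{n_2}, \quad d = \frac{1}{n_1}-\frac{1}{n_2}, \quad h = \frac{2}{s}, \quad \lambda = \frac{\xi^2+3}{6}.$$

Formula (c) (new formula):
$$z \sim \frac{\xi\sqrt{h'+\lambda'}}{h'} - \frac{d'}{6}(\xi^2+2-2s'),$$
where
$$s' = \frac{1}{n_1-1}+\frac{1}{n_2-1}, \quad d' = \frac{1}{n_1-1}-\frac{1}{n_2-1}, \quad h' = \frac{2}{s'}, \quad \lambda' = \frac{\xi^2-3}{6}.$$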













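When $n_1$, $n_2$ are large (and not too different), $\frac16 sd$ is negligible and formula (b) becomes the
formula more generally quoted

$$z \sim \frac{\xi}{\sqrt{h-\lambda}} - \left(\frac{1}{n_1}-\frac{1}{n_2}\right)\frac{\xi^2+2}{6},$$

which may be written
$$z \sim \frac{\xi}{\sqrt{h-\lambda}} - \left(\frac{1}{n_1}-\frac{1}{n_2}\right)\left(\lambda - \tfrac16\right).$$

Similarly, for sufficiently large $n_1$, $n_2$, $\frac13 s'd'$ may be neglected and formula (c) becomes

$$z \sim \frac{\xi\sqrt{h'+\lambda'}}{h'} - \left(\frac{1}{n_1-1}-\frac{1}{n_2-1}\right)\frac{\xi^2+2}{6},$$

or
$$z \sim \frac{\xi\sqrt{h'+\lambda'}}{h'} - \left(\frac{1}{n_1-1}-\frac{1}{n_2-1}\right)\left(\lambda' + \tfrac56\right).$$

It is to be noted, however, that since $\frac13 s'd'$ is approximately twice $\frac16 sd$, more care must be
exercised in deciding to neglect it. For example, when $(n_1, n_2)$ is (20, 100), $\frac13 s'd' = 0.0009$,
and for (24, 60), its value is 0.0005.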
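The size of the two correction terms is easy to tabulate. A minimal sketch, assuming the terms in question are sd/6 (for the Fisher-Cochran form) and s'd'/3 (for the new form), with s, d built from 1/n and s', d' from 1/(n-1); the function name is hypothetical.

```python
def mean_corrections(n1, n2):
    """Return (s*d/6, s'*d'/3), the second-order terms in the mean of z."""
    s = 1.0 / n1 + 1.0 / n2
    d = 1.0 / n1 - 1.0 / n2
    s_dash = 1.0 / (n1 - 1) + 1.0 / (n2 - 1)
    d_dash = 1.0 / (n1 - 1) - 1.0 / (n2 - 1)
    return s * d / 6.0, s_dash * d_dash / 3.0

for pair in ((6, 12), (24, 60), (20, 100)):
    plain, dashed = mean_corrections(*pair)
    print(pair, round(plain, 5), round(dashed, 4))
```

The output reproduces the figures quoted in the text: 0.00347 and 0.00024 for sd/6 at (6, 12) and (24, 60), and 0.0009 and 0.0005 for s'd'/3 at (20, 100) and (24, 60).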

For purposes of comparison, values of z have been computed from formulae (b) and (c),
for the four common percentage levels, over a fairly wide range of $n_1$, $n_2$. They are shown in
Table 1, together with the corresponding true values of z. The latter were obtained where
possible from the tables of Fisher and Yates: elsewhere by inverse interpolation in Tables of
the Incomplete Beta-Function followed by a logarithmic transformation. Such values are in
error by not more than 0.0001. It will be seen that neither formula yields very accurate
results when $n_1$ or $n_2$ is as small as 6, though even here the new formula is rather better with
the single exception of $n_1 = 12$, $n_2 = 6$. In actual practice, however, we are concerned with
large values of $n_1$, $n_2$, beyond the range of the published tables. Considering only those cases
where $n_1$ and $n_2$ are both greater than 20, it is seen that formula (c) gives a consistently closer
approximation than does formula (b) for both the positive and the negative tails, and for
all the percentage levels investigated, though its relative gain in accuracy is greatest at the
1 and 0.1 % levels. It may be noted, in fact, that in no case considered having $n_1$ and $n_2$
greater than 20, is the error more than 9 in the fourth decimal place, i.e. it appears
that for all except small $n_1$, $n_2$, this formula will give an approximation to z correct to
within 0.001.

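In conclusion, therefore, it is recommended that formula (c) be adopted for general use,
since it is no more difficult to compute, and is more accurate, than the existing formula.
Dropping the dashes we have the formula

$$z \sim \frac{\xi\sqrt{h+\lambda}}{h} - \left(\frac{1}{n_1-1}-\frac{1}{n_2-1}\right)\left(\lambda + \frac56 - \frac{s}{3}\right),$$

where
$$s = \frac{2}{h} = \frac{1}{n_1-1}+\frac{1}{n_2-1}, \qquad \lambda = \frac{\xi^2-3}{6},$$

or, if $\frac{s}{3}\left(\frac{1}{n_1-1}-\frac{1}{n_2-1}\right)$ may be neglected,

$$z \sim \frac{\xi\sqrt{h+\lambda}}{h} - \left(\frac{1}{n_1-1}-\frac{1}{n_2-1}\right)\left(\lambda + \frac56\right).$$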







The values of ξ and λ for the four percentage levels are:

        20%        5%         1%         0.1%
ξ       0.8416     1.6449     2.3263     3.0902
λ      -0.3819    -0.0491     0.4020     1.0916
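Formulae (b) and (c) lend themselves to direct computation. The sketch below is a hedged illustration, not the author's own procedure: the function names are hypothetical, the normal deviate ξ is taken from Python's standard library instead of the tabulated values, and the formulae are read as z ~ ξ/√(h-λ) - (d/6)(ξ²+2+s) with λ = (ξ²+3)/6 for (b), and z ~ ξ√(h'+λ')/h' - (d'/6)(ξ²+2-2s') with λ' = (ξ²-3)/6 for (c).

```python
import math
from statistics import NormalDist

def z_fisher_cochran(n1, n2, tail_prob):
    """Formula (b): Fisher-Cochran form including the sd/6 term in the mean."""
    xi = NormalDist().inv_cdf(1.0 - tail_prob)
    s = 1.0 / n1 + 1.0 / n2
    d = 1.0 / n1 - 1.0 / n2
    h = 2.0 / s
    lam = (xi * xi + 3.0) / 6.0
    return xi / math.sqrt(h - lam) - d / 6.0 * (xi * xi + 2.0 + s)

def z_new(n1, n2, tail_prob):
    """Formula (c): the new form, built on 1/(n - 1) in place of 1/n."""
    xi = NormalDist().inv_cdf(1.0 - tail_prob)
    s = 1.0 / (n1 - 1) + 1.0 / (n2 - 1)
    d = 1.0 / (n1 - 1) - 1.0 / (n2 - 1)
    h = 2.0 / s
    lam = (xi * xi - 3.0) / 6.0
    return xi * math.sqrt(h + lam) / h - d / 6.0 * (xi * xi + 2.0 - 2.0 * s)

for n1, n2 in ((24, 60), (60, 24), (6, 12)):
    print((n1, n2), round(z_fisher_cochran(n1, n2, 0.05), 4), round(z_new(n1, n2, 0.05), 4))
```

Under these readings the functions reproduce entries of Table 1, for instance 0.2650 and 0.2654 for (24, 60) at the 5 % level.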























My thanks are due to Dr J. Wishart, whose suggestion was the basis of this paper.


REFERENCES 


COCHRAN, W. G. (1940). Note on an approximative formula for significance levels of z. Ann. Math.
Statist. 11, 93.

CORNISH, E. A. & FISHER, R. A. (1937). Moments and cumulants in the specification of distributions.
Rev. Inst. Int. Statist. 4, 307.

FISHER, R. A. & YATES, F. (1943). Statistical Tables, 2nd ed. Oliver and Boyd.

KENDALL, M. G. (1945). Advanced Theory of Statistics, 1, 156, 2nd ed. C. Griffin and Co.

WISHART, J. (1947). The cumulants of the z and of the logarithmic χ² and t distributions. Biometrika,
34, 170.













MISCELLANEA 


Note on the cumulants of Fisher’s z-distribution 
By LEO A. AROIAN, Hunter College 


In a recent article Dr J. Wishart (1947) stated: 'Explicit expressions for the exact cumulants of Fisher's
z-distribution do not appear ever to have been published.' Fisher's z-distribution and the related
Snedecor's F-distribution formed a part of my doctor's thesis and rather full results concerning the cumulants
of the z-distribution and other properties of the distribution were published in the Annals of
Mathematical Statistics (Aroian, 1941) some time ago.* I should like to take this opportunity of adding certain
comments on the Gram-Charlier Type A approximation to the z-distribution and the type III
approximation to the F-distribution.

To obtain the cumulants of the z-distribution I expanded the moment generating function $M_z(\theta)$ in
powers of $\theta$ and found $\lambda_{k:z}$, the kth semi-invariant (or cumulant) of z as the coefficient of $\theta^k/k!$. The exact
results correspond with Wishart's formulae (9) to (15), although given in a different form, and need not
be repeated here. In addition, asymptotic formulae for $\lambda_{k:z}$, $n_1$ and $n_2$ large, were derived by means of
the Euler-Maclaurin sum formula. Furthermore, another type of formula could have been given for
$n_1$ small but $n_2$ large, merely by expanding that part of $\lambda_{k:z}$ in which $n_2$ occurs by the Euler-Maclaurin
sum formula. The special cases for the logarithmic χ², the logarithmic t, and the logarithmic normal
probability functions follow by substituting the proper limiting values of $n_1$ and $n_2$.

In my previous paper I was overcautious concerning the type A approximation to the z-distribution.
Actually the method is fairly accurate although tedious. Taking

$$F(t) = \phi(t) + A_3\phi'''(t) + A_4\phi''''(t), \qquad \int_{t_0}^{\infty} F(t)\,dt = \eta,$$

we have
$$\int_{t_0}^{\infty} F(t)\,dt = \int_{t_0}^{\infty}\phi(t)\,dt + \phi(t_0)\{-A_3(t_0^2-1) + A_4(t_0^3-3t_0)\},$$

where $\eta$ is usually 0.10, 0.05, 0.025, 0.01, etc. As an example take $n_1 = 24$, $n_2 = 60$; then

$$\lambda_{1:z} = -0.0127429, \quad \sigma_z = 0.173779, \quad \lambda_{3:z} = -0.0007998, \quad \lambda_{4:z} = 0.0000867,$$

$$A_3 = -\frac{\lambda_{3:z}}{3!\,\sigma_z^3} = 0.025345, \qquad A_4 = \frac{\lambda_{4:z}}{4!\,\sigma_z^4} = 0.00306;$$

$t_0$ for $\eta = 0.05$ is 1.60094, $z_{.05} = 0.26547$ against the accurate value of 0.26534. For the 1 % point
$t_0 = 2.2338$, $z_{.01} = 0.3754$ against the accurate value of 0.3746. When $n_1 = n_2 = 24$, $z_{.05} = 0.3423$
against the accurate value of 0.3425.


The type III approximation to the F-distribution is of some interest since for $n_1$ moderate and $n_2$
large, $n_1F$ tends to be distributed as χ² with $n_1$ degrees of freedom. Since

$$\text{Mean } F = \bar F = \frac{n_2}{n_2-2}; \qquad \sigma_F = \frac{n_2}{n_2-2}\sqrt{\frac{2(n_1+n_2-2)}{n_1(n_2-4)}};$$

$$\alpha_{3:F} = \frac{4(2n_1+n_2-2)}{n_1(n_2-6)}\sqrt{\frac{n_1(n_2-4)}{2(n_1+n_2-2)}},$$

we find the 5, 1 or 0.1 % points for F by using

$$F_{.05} = \bar F + \sigma_F(1.64485 + 0.28392\,\alpha_{3:F} - 0.04902\,\alpha_{3:F}^2),$$
$$F_{.01} = \bar F + \sigma_F(2.32635 + 0.73330\,\alpha_{3:F} - 0.024957\,\alpha_{3:F}^2),$$
$$F_{.001} = \bar F + \sigma_F(3.0903 + 1.4190\,\alpha_{3:F} + 0.05667\,\alpha_{3:F}^2).$$








* [Both Dr Wishart, as author, and myself as editor regret that owing to wartime preoccupation the 
publication of Dr Aroian’s 1941 paper was overlooked. E.S.P.] 




These formulae for the levels of significance of the χ² distribution are from a previous paper (Aroian,
1943). For $n_1 = 24$, $n_2 = 60$, $F_{.05}$ by this approximation is 1.709 compared with the accurate value
of 1.700. For $n_1 = 24$, $n_2 = 100$, $F_{.05}$ by this approximation is 1.631 against the accurate value of 1.627.
For $n_1 = n_2 = 100$, $F_{.05}$ by this approximation is 1.394 as compared with the accurate value of 1.392.
While these results are not too poor, they are not so accurate as the well-known formulae of Cochran-
Fisher or of E. Paulson (1942) which, for large values of $n_1$ and $n_2$, generally give 4 significant figures.
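The three numerical examples can be reproduced with a few lines of code. This is a sketch under stated assumptions: it uses the standard expressions for the mean, standard deviation and skewness of F, and reads the 5 % formula as F̄ + σ_F(1.64485 + 0.28392 α₃ - 0.04902 α₃²); the function name is hypothetical.

```python
import math

def f_upper_5_percent(n1, n2):
    """Type III style approximation to the upper 5 % point of F (assumed reading of the formula)."""
    mean_f = n2 / (n2 - 2)
    sigma_f = mean_f * math.sqrt(2 * (n1 + n2 - 2) / (n1 * (n2 - 4)))
    # Skewness of F; algebraically equal to the form with n1 inside and outside the root.
    alpha3 = 4 * (2 * n1 + n2 - 2) / (n2 - 6) * math.sqrt((n2 - 4) / (2 * n1 * (n1 + n2 - 2)))
    return mean_f + sigma_f * (1.64485 + 0.28392 * alpha3 - 0.04902 * alpha3 ** 2)

for pair in ((24, 60), (24, 100), (100, 100)):
    print(pair, round(f_upper_5_percent(*pair), 3))
```

With these readings the function returns values that round to 1.709, 1.631 and 1.394, the figures quoted above.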


REFERENCES 


AROIAN, L. A. (1941). Ann. Math. Statist. 12, 429.
AROIAN, L. A. (1943). Ann. Math. Statist. 14, 93.
PAULSON, E. (1942). Ann. Math. Statist. 13, 233.
WISHART, J. (1947). Biometrika, 34, 170.


A note on the mean deviation from the median 
By K. R. NAIR 


For samples drawn from a normal universe, Godwin (1945) obtained the sampling distribution of the
mean deviation when the individual deviations are measured from the sample mean. It is well known that
the mean deviation is least when it is measured from the sample median.* Let us refer to them as 'mean
deviation from mean' and 'mean deviation from median' respectively, and use the letters m and m' to
denote their sample estimates.

The exact sampling distribution of m being now known and its probability integral tabulated, the
question may well be asked what the distribution of m' is. Since $m' \le m$, their expectations have the
same relationship
$$E(m') \le E(m). \tag{1}$$


For samples of n from a normal population with standard deviation $\sigma$, $E(m') = f_2\sigma$ and $E(m) = f_1\sigma$,
where $f_2 \le f_1$. For getting unbiased estimates of $\sigma$ we should divide m' by $f_2$ and m by $f_1$. What we are
now interested to know is which of the two estimates has a smaller standard error. In the case of m, it
has been shown by Helmert (1876) and Fisher (1920) that

$$f_1 = \sqrt{\frac{2(n-1)}{n\pi}}, \tag{2}$$

and
$$\text{s.e. of } (m/f_1) = \sigma\sqrt{\frac1n\left[\frac{\pi}{2} + \sqrt{n(n-2)} - n + \sin^{-1}\frac{1}{n-1}\right]}. \tag{3}$$

In the case of m', we know neither $f_2$ nor the standard error of $(m'/f_2)$ for samples of size n.

It is obvious that when n is very large, the mean and median will differ very little from one another
and hence $m' \to m$ and $f_2 \to f_1$. It is interesting to note that, at the other end of the scale, namely, when
n = 2, m and m' are identical, and equal to one-half the sample range.

To discover any real difference that may exist between the standard errors of (m/f,) and (m’/f.), 
which is the same as determining the difference between the coefficients of variation of m and m’, we must 
consider samples of size greater than 2. 


(i) Let us take n = 3, and let $x_1$, $x_2$, $x_3$ be the observed values arranged in order of ascending magnitude.
We at once find that

$$m' = \tfrac13(x_3 - x_1). \tag{4}$$

The distribution of m' for samples of 3 is therefore derivable from that of the range. The probability
integral of the range has been tabulated by Pearson & Hartley (1942) for n = 2 to 20. For our purpose
it is necessary only to know the values of the mean range ($\bar w$) and the standard error of the range ($\sigma_w$)





* When n, the sample size, is an odd number, the sample median is by definition the value of the
½(n+1)th ranked observation. When n is even, the sample median is conventionally taken as the mean
of the ½n th and ½(n+2)th ranked values. The mean deviation from the median will have the same
magnitude whatever value, between the ½n th and ½(n+2)th ranked values, the median takes, when
n is even. No complication is therefore introduced by accepting the conventional definition of the
median for even-sized samples.










for samples of 3. These can be calculated, correct to six decimal places, from certain numerical values
given by Pearson (1926). Using his figures,

$$\bar w = 1.692568 \times \sigma, \tag{5}$$
$$\sigma_w = 0.888368 \times \sigma. \tag{6}$$

The value of $f_2$ for samples of 3 is, therefore,
$$f_2 = \tfrac13 \times 1.692568 = 0.56419,$$

and the standard error of $m'/f_2$ is
$$\frac{0.888368}{1.692568}\,\sigma = 0.52486\sigma, \tag{7}$$
correct to five decimal places.
The corresponding values for $f_1$ and the standard error of $(m/f_1)$ are obtained by putting n = 3 in
equations (2) and (3) and are
$$f_1 = \sqrt{\frac{4}{3\pi}} = 0.65147, \tag{8}$$
and
$$\text{s.e. of } (m/f_1) = \sigma\sqrt{\tfrac13\left(\tfrac{2\pi}{3} + \sqrt3 - 3\right)} = 0.52486\sigma, \tag{9}$$

correct to five decimal places.

Although (9) can be evaluated to any number of decimal places, we are not in a position to bring (7)
to a higher order of accuracy than five decimal places. It is very unlikely that (7) and (9) are absolutely
identical, but we may safely conclude that they are practically the same.
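The Helmert-Fisher results are simple to evaluate for any n. The sketch below assumes formula (2) reads f₁ = √(2(n-1)/(nπ)) and formula (3) reads s.e.(m/f₁) = σ√{(1/n)[π/2 + √(n(n-2)) - n + sin⁻¹(1/(n-1))]}; the function names are hypothetical.

```python
import math

def f1(n):
    """Expectation of the mean deviation from the mean, in units of sigma."""
    return math.sqrt(2.0 * (n - 1) / (n * math.pi))

def se_of_m_over_f1(n):
    """Standard error of m/f1 in units of sigma (equally, the coefficient of variation of m)."""
    inner = math.pi / 2.0 + math.sqrt(n * (n - 2)) - n + math.asin(1.0 / (n - 1))
    return math.sqrt(inner / n)

for n in (3, 4):
    print(n, round(f1(n), 6), round(se_of_m_over_f1(n), 6))
```

For n = 3 this gives 0.65147 and 0.52486, and for n = 4 it gives 0.690988 and 0.429842, in agreement with the figures quoted in this note.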

(ii) We next come to samples of 4. If $x_1$, $x_2$, $x_3$, $x_4$ be the observations arranged in order of ascending
magnitude, the mean deviation from median is given by

$$m' = \tfrac14(x_4 + x_3 - x_2 - x_1). \tag{10}$$

The distribution of m' follows immediately from 'some order statistic distributions for samples of
size 4' obtained by Walsh (1946) and is as follows (m' being measured in units of $\sigma$):

$$p(m')\,dm' = \frac{12}{\pi}\sqrt{\frac{2}{\pi}}\;e^{-2m'^2}\left(\int_0^{2m'} e^{-\frac12 v^2}\,dv\right)^2 dm'. \tag{11}$$

The probability integral of m' is given by
$$P(m') = \int_0^{m'} p(m')\,dm' = \left(\sqrt{\frac{2}{\pi}}\int_0^{2m'} e^{-\frac12 v^2}\,dv\right)^3. \tag{12}$$

The values of P(m') given by (12) can easily be evaluated using the normal probability integral table
and are given in cols. (3) and (6) of the table below, alongside corresponding values (given in cols. (2)
and (5)) for the probability integral of the mean deviation (m) from the mean, for samples of 4, copied
from Godwin's (1945) tables.
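The tabulated values of P(m') can be regenerated from the normal probability integral. The sketch below assumes that, for σ = 1, (12) reduces to the cube of 2Φ(2m') - 1, where Φ is the standard normal distribution function; that reading is an inference checked against the tabulated column, and the function name is hypothetical.

```python
from statistics import NormalDist

def prob_integral_m_prime(m_prime):
    """P(m') for samples of four from a unit normal, under the assumed closed form of (12)."""
    inner = 2.0 * NormalDist().cdf(2.0 * m_prime) - 1.0
    return inner ** 3

for m_prime in (0.1, 0.5, 1.0, 2.0):
    print(m_prime, round(prob_integral_m_prime(m_prime), 5))
```

The printed values agree with the P(m') column of the table (0.00398, 0.31818, 0.86962 and 0.99981).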


Table giving the probability integral of the mean deviation from (a) mean and (b) median
for samples of four observations from a normal universe (σ = 1)

m (or m')   P(m)      P(m')        m (or m')   P(m)      P(m')
0.0         0.00000   0.00000      1.3         0.96758   0.97229
0.1         0.00333   0.00398      1.4         0.98229   0.98475
0.2         0.02534   0.03003      1.5         0.99073   0.99192
0.3         0.0787$   0.09204      1.6         0.99534   0.99588
0.4         0.16693   0.19139      1.7         0.99775   0.99798
0.5         0.28345   0.31818      1.8         0.99895   0.99905
0.6         0.41552   0.45629      1.9         0.99953   0.99957
0.7         0.54836   0.58951      2.0         0.99980   0.99981
0.8         0.66934   0.70592      2.1         0.99992   0.99992
0.9         0.77040   0.79954      2.2         0.99997   0.99997
1.0         0.84860   0.86962      2.3         0.99999   0.99999
1.1         0.90502   0.91888      2.4         1.00000   1.00000
1.2         0.94321   0.95162
















We note that although m and m' have an infinite range from 0 to ∞, their probability integrals rapidly
approach unity, this value being reached to five decimal place accuracy when m (or m') = 2.40. We can
approximately work out the moments of the two distributions from the table above. The values of the
mean and the standard deviation (applying Sheppard's correction for grouping) of m and m' so obtained
are given below:

Mean:                      $\bar m = 0.690986\sigma$,            $\bar m' = 0.663187\sigma$,
Standard deviation:        $\sigma_m = 0.297015\sigma$,          $\sigma_{m'} = 0.292979\sigma$,          (13)
Coefficient of variation:  $\sigma_m/\bar m = 0.429842$,         $\sigma_{m'}/\bar m' = 0.441775$.

The values of $\bar m$ and $\sigma_m/\bar m$ obtained from the exact formulae (2) and (3) are

$$\bar m = \sigma\sqrt{\frac{3}{2\pi}} = 0.690988\sigma, \qquad \sigma_m/\bar m = \tfrac12\sqrt{\tfrac12\pi + \sin^{-1}\tfrac13 + 2\sqrt2 - 4} = 0.429842, \tag{14}$$

showing close agreement with the values given in (13) for the mean and coefficient of variation of m.
We may therefore consider the mean and coefficient of variation of m', approximately evaluated in (13),
to be of sufficient accuracy to warrant the conclusion that for samples of size 4, the mean deviation from
the mean leads to a more 'efficient' estimate of the population standard deviation than the mean
deviation from the median. As the distribution of the latter is not known for n > 4, we are not in a position to
say whether this conclusion holds good, in general, for all values of n.
In conclusion, it seems worth making the following point:
(a) if expressions for the expectation and variance of m' were available and tables of its probability
integral worked out,
(b) if the efficiency of the m' estimate compared to the m estimate for n > 4 was not appreciably worse
than for the case n = 4,
there would be strong practical grounds for using m' rather than m in view of greater simplicity in
calculation. In both cases we must first arrange the observations in order of magnitude. Then if
$x_1 < x_2 < \dots < x_n$, m' may be calculated from the formula

$$m' = \frac1n\{(x_{n-t+1} + x_{n-t+2} + \dots + x_n) - (x_1 + x_2 + \dots + x_t)\}, \tag{15}$$

where $t = \frac12 n$ or $\frac12(n-1)$ according as n is even or odd.
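Formula (15) amounts to subtracting the sum of the t smallest observations from the sum of the t largest. A minimal sketch, with hypothetical function names and made-up sample values, compares it with the direct definition of the mean deviation from the median:

```python
from statistics import median

def mean_deviation_from_median(values):
    """Formula (15): (sum of the t largest - sum of the t smallest) / n, with t = n // 2."""
    xs = sorted(values)
    n = len(xs)
    t = n // 2  # n/2 for even n, (n - 1)/2 for odd n
    return (sum(xs[n - t:]) - sum(xs[:t])) / n

def mean_deviation_direct(values):
    """Direct definition: average absolute deviation from the sample median."""
    centre = median(values)
    return sum(abs(x - centre) for x in values) / len(values)

odd_sample = [3.1, 0.4, 2.2, 5.0, 1.7]   # illustrative data only
even_sample = [1.0, 2.0, 4.0, 9.0]       # illustrative data only
print(mean_deviation_from_median(odd_sample), mean_deviation_direct(odd_sample))
print(mean_deviation_from_median(even_sample), mean_deviation_direct(even_sample))
```

Both routes give the same value, and the shortcut needs only the ordered sample, which is the simplicity the note refers to.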
For m, however, we must also calculate the arithmetic mean $\bar x$ and look for $x_k$ and $x_{k+1}$ between which
$\bar x$ lies. Then m can be obtained from one of the three formulae

$$\frac{nm}{2k} = \bar x - \frac{x_1 + x_2 + \dots + x_k}{k},$$
$$\frac{nm}{2(n-k)} = \frac{x_{k+1} + \dots + x_n}{n-k} - \bar x, \tag{16}$$
$$\frac{n^2 m}{2k(n-k)} = \frac{x_{k+1} + \dots + x_n}{n-k} - \frac{x_1 + \dots + x_k}{k}.$$

This certainly involves a rather longer process.

It is interesting to note that m' becomes a special case of the measure of dispersion based on the difference
between the sums of the first and the last r observations (in order of magnitude) suggested by Jones
(1946), the range becoming another special case of the same measure, when r = 1.


REFERENCES 


FISHER, R. A. (1920). Mon. Not. R. Astr. Soc. 80, 758.
GODWIN, H. J. (1945). Biometrika, 33, 254.
HELMERT, W. (1876). Astr. Nachr. 88, no. 2096.
JONES, A. E. (1946). Biometrika, 33, 274.
PEARSON, E. S. (1926). Biometrika, 18, 173.
PEARSON, E. S. & HARTLEY, H. O. (1942). Biometrika, 32, 301.
WALSH, J. E. (1946). Ann. Math. Statist. 17, 246.







On the method of paired comparisons 
By P. A. P. MORAN, Institute of Statistics, Oxford University 


M. G. Kendall & B. Babington Smith (1940) have discussed the 'method of paired comparisons' for
investigating preferences. Suppose we are given n objects A, ..., K, and an observer is asked to choose
between every pair. If A is preferred to B we write A → B. If the observer is not completely consistent,
either because of his own inefficiency or because the objects are not really capable of being ranked in
respect of the quality under consideration, he may make preferences of the type A → B → C → A, and we
call this an inconsistent or circular triad. Write d for the number of circular triads in a given experiment.
Then Kendall & Babington Smith show that

$$\zeta = 1 - \frac{24d}{n^3-n} \quad (n \text{ odd}), \qquad \zeta = 1 - \frac{24d}{n^3-4n} \quad (n \text{ even})$$

may be regarded as a 'coefficient of consistence' and lies between 0 and 1, being capable of attaining
both these limits.
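The count d and the coefficient can be obtained by brute force from a record of the preferences. The following sketch is illustrative only: the representation of the experiment as a set of ordered pairs and both function names are assumptions made for the example.

```python
from itertools import combinations

def count_circular_triads(n, prefers):
    """Count circular triads; `prefers` holds ordered pairs (a, b) meaning a is preferred to b."""
    d = 0
    for i, j, k in combinations(range(n), 3):
        forward = (i, j) in prefers and (j, k) in prefers and (k, i) in prefers
        backward = (j, i) in prefers and (k, j) in prefers and (i, k) in prefers
        if forward or backward:
            d += 1
    return d

def coefficient_of_consistence(n, d):
    """1 - 24d/(n^3 - n) for odd n, 1 - 24d/(n^3 - 4n) for even n."""
    denominator = n ** 3 - n if n % 2 else n ** 3 - 4 * n
    return 1.0 - 24.0 * d / denominator

cyclic = {(0, 1), (1, 2), (2, 0)}                          # A -> B -> C -> A
ranked = {(a, b) for a, b in combinations(range(4), 2)}    # a perfectly consistent ranking
print(count_circular_triads(3, cyclic), coefficient_of_consistence(3, 1))
print(count_circular_triads(4, ranked), coefficient_of_consistence(4, 0))
```

The single circular triad on three objects gives a coefficient of 0 and the consistent ranking gives 1, the two limits mentioned in the text.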


Now suppose that each comparison is made at random so that there are equal chances that A → B
and B → A. The distribution of d is then of interest. They calculate this distribution exactly for
n = 2, ..., 7 and conjecture that its moments are given by

$$\mu'_1 = \frac14\binom n3, \quad \mu_2 = \frac{3}{16}\binom n3, \quad \mu_3 = -\frac{3}{32}\binom n3(n-4),$$
$$\mu_4 = \frac{3}{256}\binom n3\left\{9\binom{n-3}{3} + 39\binom{n-3}{2} + 9\binom{n-3}{1} + 7\right\},$$

these being polynomials in n which agree with their numerical calculations for n = 2, ..., 7. They also
conjecture that the distribution tends to normality when n increases. In the present note we prove
these statements.


Let the objects be numbered from 1 to n. Write $P_{ijk} = 1$ if the triad (i, j, k) is circular, and $P_{ijk} = 0$
if it is not. Then $d = \Sigma P_{ijk}$, the sum being taken over all such triads. Now by enumerating the various
cases we see that $E(P_{ijk}) = \frac14$ and so $\mu'_1(d) = E(\Sigma P_{ijk}) = \frac14\binom n3$. Now consider $\mu'_2(d) = E((\Sigma P_{ijk})^2)$.
Consider the types of terms which result when we expand this. In the first place we have $\binom n3$ terms
typified by $P_{123}^2$, and these contribute $\frac14\binom n3$ to $\mu'_2(d)$. Similarly, we have terms typified by $P_{123}P_{145}$,
$P_{123}P_{124}$ and $P_{123}P_{456}$, and the numbers of these are respectively $\frac32\binom n3(n-3)(n-4)$, $3\binom n3(n-3)$ and
$\binom n3\binom{n-3}{3}$, whilst their expectations are each $\frac1{16}$. It follows that

$$\mu'_2(d) = \frac{1}{16}\binom n3^2 + \frac{3}{16}\binom n3,$$

and so
$$\mu_2(d) = \frac{3}{16}\binom n3.$$








The calculation of $\mu_3(d)$ is a good deal more complicated. $\mu'_3(d) = E((\Sigma P_{ijk})^3)$, and on expanding we
get 16 types of terms, typified by
Piss PsP ras PissPiew Piss Pases 
PrssPusPier PissPisaPiss» PissPissPases Piss PissPisce 
PyasPiesPisss  PissPasaPises PissPaasPser» Piss Pos Pose 
Py23 PP 145 Pere Pry23 Pica Poe: Pry23PaseP reo: P23 PsasPeoas- 
After some calculation we find the sum of the contributions of these to be

$$\mu_3'(d) = \frac{1}{2304}\binom{n}{3}\{n^6 - 6n^5 + 13n^4 + 42n^3 - 158n^2 - 108n + 864\}.$$

Reducing to the mean we get

$$\mu_3(d) = -\frac{3}{32}\binom{n}{3}(n-4).$$

The calculation of $\mu_4(d)$ is a great deal more complicated, there being 85 terms which are not zero;
we finally obtain, after lengthy calculations,

$$\mu_4'(d) = \frac{1}{55296}\binom{n}{3}\{n^9 - 9n^8 + 33n^7 + 45n^6 - 582n^5 + 504n^4 + 5732n^3 - 10692n^2 - 30024n + 80352\},$$

and so

$$\mu_4(d) = \frac{1}{55296}\binom{n}{3}\{972n^3 + 972n^2 - 36936n + 80352\},$$

which reduces to the conjectured result.
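The third and fourth moments can be checked in the same way. The sketch below is an illustration added here, not the author's calculation: it enumerates the distribution of d for n = 3, 4, 5 and compares the exact central moments with $-\tfrac{3}{32}\binom{n}{3}(n-4)$ and with the polynomial form of $\mu_4$, whose coefficients are taken as $972n^3 + 972n^2 - 36936n + 80352$ with divisor 55296 (an assumption about the garbled print, consistent with the binomial form of the conjecture).

```python
from fractions import Fraction
from itertools import combinations, product
from math import comb


def d_distribution(n):
    """List of d over all equally likely outcomes of the n(n-1)/2 comparisons."""
    pairs = list(combinations(range(n), 2))
    index = {pair: pos for pos, pair in enumerate(pairs)}
    triads = [(index[(i, j)], index[(j, k)], index[(i, k)])
              for i, j, k in combinations(range(n), 3)]
    values = []
    for bits in product((0, 1), repeat=len(pairs)):
        # Triad (i, j, k) is circular when the (i,j) and (j,k) results agree
        # with each other and disagree with the (i,k) result.
        values.append(sum(1 for a, b, c in triads if bits[a] == bits[b] != bits[c]))
    return values


def central_moment(values, r):
    mean = Fraction(sum(values), len(values))
    return sum((v - mean) ** r for v in values) / len(values)


def mu3_formula(n):
    return Fraction(-3, 32) * comb(n, 3) * (n - 4)


def mu4_formula(n):
    poly = 972 * n**3 + 972 * n**2 - 36936 * n + 80352
    return Fraction(comb(n, 3), 55296) * poly


for n in (3, 4, 5):
    values = d_distribution(n)
    print(n,
          central_moment(values, 3) == mu3_formula(n),
          central_moment(values, 4) == mu4_formula(n))
```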

We now prove that the distribution tends to normality. To do this, it is sufficient (Kendall, 1943,
p. 110) to prove that

$$\frac{\mu_{2m}}{\mu_2^{m}} \to \frac{(2m)!}{2^m m!}, \qquad \frac{\mu_{2m+1}}{\mu_2^{m+\frac12}} \to 0.$$

Consider the second of these first. Write $Q_{ijk} = P_{ijk} - \tfrac14$. Then

$$\mu_{2m+1}(d) = E[(\Sigma Q_{ijk})^{2m+1}].$$


It is clear that for any given m we could calculate $\mu_{2m+1}(d)$, given sufficient labour, by expanding this
and considering the expectation of each type of term and calculating the number of times it occurs, which
will be a polynomial in n. Now consider the various types of terms in the expansion of $(\Sigma Q_{ijk})^{2m+1}$. We
classify these terms according to whether the Q’s have common suffixes. Let $Q_{ijk}Q_{lmn}\ldots Q_{pqr}$ be a
typical product in the expansion. If this can be separated into p groups of products of Q’s such that
different groups have no common suffixes whilst, within each group, the triads are connected to each other
by having common points, we shall say such a product ‘contains p groups’. Moreover, the number of
times such a term occurs will be a polynomial in n whose order is equal to the number of distinct suffixes
occurring in the product. If in a group a suffix only appears once, the inconsistency of the triad containing
it is unaffected by the remainder of the group and the expectation of the product of Q’s in that group will
be zero. It follows that in all those terms which contribute something non-zero to $\mu_{2m+1}$, none of
the groups can contain a suffix which appears only once. Therefore, since all terms which contain more
than m groups will have at least one group consisting of a single Q, the expectation of such terms will
be zero. It follows that $\mu_{2m+1}$ is a polynomial in n, of degree $3m+1$ [= integral part of $\tfrac32(2m+1)$] at
most, whose coefficients depend on m only. But $\mu_2(d)$ is of degree 3 in n and so

$$\frac{\mu_{2m+1}}{\mu_2^{m+\frac12}} = O\left(\frac{n^{3m+1}}{n^{3m+\frac32}}\right) = O(n^{-\frac12}) \to 0, \quad \text{for } m = 1, 2, \ldots.$$





Now consider $\mu_{2m}$. This is a polynomial in n, and our aim is to find the order and coefficient of the term of
largest order. In the first place we need only consider terms with m or fewer groups, for if a term has more
than m groups, one at least will consist of a single Q and the expectation of the term will be zero. Moreover,
as before, in each term, the suffixes in each group must each occur at least twice in that group. The
number of times each type of term occurs will be a polynomial in n of order equal to the total number of
distinct suffixes in that term. As we shall show the leading term in $\mu_{2m}(d)$ to be of order 3m, we can neglect
terms whose frequency is less than this and therefore we can neglect all terms in which a suffix appears
more than twice. Now consider a term with fewer than m groups and therefore containing a group of
order greater than two in the Q’s. As no suffix can occur more than twice, no Q can occur more than once.
Consider any $Q_{ijk}$, say, of this group. Then either the suffixes i, j, k are common to three other triads or
one, i, say, is common to another triad and j, k common to a third. In either case evaluation of the
expectation shows it to be zero. We can therefore restrict our attention to the case where there are m
groups each containing two triads. Such groups can only be of the form $Q_{123}^2$, $Q_{123}Q_{124}$, $Q_{123}Q_{145}$ and the
expectations of the two latter are zero whilst the expectation of $Q_{123}^2$ is

$$\tfrac14(1-\tfrac14)^2 + \tfrac34(\tfrac14)^2 = \tfrac{3}{16}.$$


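The number of groups is m and the number of ways of choosing m such distinct pairs out of $(\Sigma Q_{ijk})^{2m}$
is $\dfrac{(2m)!}{2^m m!}$, so that the leading term in $\mu_{2m}$ is

$$\frac{(2m)!}{2^m m!}\left(\frac{1}{32}\right)^m n^{3m},$$

whilst the leading term in $\mu_2$ is $\tfrac{1}{32}n^3$ and so

$$\frac{\mu_{2m}}{\mu_2^{m}} \to \frac{(2m)!}{2^m m!}.$$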


The distribution therefore tends to normality. 
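For m = 2 the limit required is $(2m)!/(2^m m!) = 3$, and the approach to it can be seen directly from the closed forms for $\mu_2$ and $\mu_4$. The short sketch below is an added illustration (the function names are not from the note); it evaluates $\mu_4/\mu_2^2$ exactly for increasing n.

```python
from fractions import Fraction
from math import comb


def mu2(n):
    """Variance of d: (3/16) * C(n, 3)."""
    return Fraction(3 * comb(n, 3), 16)


def mu4(n):
    """Fourth central moment of d in the binomial-coefficient form."""
    bracket = 9 * comb(n - 3, 3) + 39 * comb(n - 3, 2) + 9 * comb(n - 3, 1) + 7
    return Fraction(3 * comb(n, 3), 256) * bracket


def kurtosis_ratio(n):
    """mu4 / mu2^2, which should tend to 3 for a normal limit."""
    return mu4(n) / mu2(n) ** 2


for n in (10, 100, 1000, 10000):
    print(n, float(kurtosis_ratio(n)))
```

The ratio exceeds 3 by a quantity of order 1/n, so the convergence is slow but steady.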


REFERENCES 


KENDALL, M. G. & BABINGTON SMITH, B. (1940). Biometrika, 31, 324.
KENDALL, M. G. (1943). The Advanced Theory of Statistics, 1. London: Charles Griffin and Co.


Notes on the calculation of autocorrelations of linear 
autoregressive schemes 


By M. H. QUENOUILLE 


1. Bartlett (1946) has recently shown how, for a series of observations, we can test whether the
observations can be adequately represented by a linear autoregressive scheme

$$u_{n+j} + a_1u_{n+j-1} + \ldots + a_ju_n = \epsilon_{n+j}, \qquad (1)$$

where the $a_i$ are known or fitted values, and $\epsilon_{n+j}$ is an error component independent of $u_{n+j-1}$. Bartlett's
test is based on the formula

$$\operatorname{cov}(r_s, r_{s+t}) \sim \frac{1}{n-s}\sum_{i=-\infty}^{\infty}\rho_i\rho_{i+t},$$

where $r_s$ is the estimate of the true autocorrelation $\rho_s$ between $u_n$ and $u_{n+s}$.

The purpose of the note is to demonstrate how, using generating functions, $\rho_i$ and $\sum_{i=-\infty}^{\infty}\rho_i\rho_{i+t}$ can
be calculated with the minimum of computation.


2. The method of generating functions seems to have been used by Wold (1938), who applied them to
finding the variances and covariances of linear forms of finite extent in variables such as $\epsilon_{n+j}$. We shall,
however, be concerned with linear forms of infinite extent.

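It can easily be shown that the solution of (1) can be written

$$u_n = \epsilon_n + b_1\epsilon_{n-1} + b_2\epsilon_{n-2} + \ldots, \qquad (2)$$

where

$$(1 + a_1t + \ldots + a_jt^j)^{-1} = 1 + b_1t + b_2t^2 + \ldots. \qquad (3)$$

For example, if $u_{n+2} + au_{n+1} + bu_n = \epsilon_{n+2}$, we have

$$(1 + at + bt^2)^{-1} = (1 - 2x\cos\theta + x^2)^{-1} = 1 + \frac{\sin 2\theta}{\sin\theta}x + \frac{\sin 3\theta}{\sin\theta}x^2 + \ldots,$$

where $\cos\theta = -\tfrac12 a/\sqrt{b}$, $x = t\sqrt{b}$, and hence

$$b_l = b^{l/2}\,\frac{\sin(l+1)\theta}{\sin\theta} = \frac{2b^{(l+1)/2}}{\sqrt{(4b - a^2)}}\sin(l+1)\theta.$$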
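The closed form for the coefficients of a second-order scheme can be compared with the coefficients obtained by simply expanding $(1 + at + bt^2)^{-1}$ term by term. The following sketch is an added illustration, assuming $a^2 < 4b$ so that $\theta$ is real; the function names are hypothetical.

```python
import math


def b_by_recurrence(a, b, count):
    """Coefficients of 1/(1 + a t + b t^2), using b_l = -a b_{l-1} - b b_{l-2}."""
    coeffs = [1.0, -a]
    while len(coeffs) < count:
        coeffs.append(-a * coeffs[-1] - b * coeffs[-2])
    return coeffs[:count]


def b_closed_form(a, b, l):
    """b_l = b^(l/2) sin((l+1) theta) / sin(theta), with cos(theta) = -a / (2 sqrt(b))."""
    theta = math.acos(-a / (2.0 * math.sqrt(b)))
    return b ** (l / 2.0) * math.sin((l + 1) * theta) / math.sin(theta)


# Coefficients a = -1.1, b = 0.5 are those of the worked example later in the note.
a, b = -1.1, 0.5
for l, value in enumerate(b_by_recurrence(a, b, 8)):
    print(l, round(value, 6), round(b_closed_form(a, b, l), 6))
```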








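3. Using this generating function we have

$$\sigma^2\sum_{i=-\infty}^{\infty}\rho_it^i = \lim_{c\to1}\left[(1 + a_1t + \ldots + a_jt^j)\left\{1 + a_1\frac{c}{t} + \ldots + a_j\left(\frac{c}{t}\right)^j\right\}\right]^{-1}. \qquad (4)$$

Now the expansion of (4) can be achieved by splitting into partial fractions and, in general, we can let
$c \to 1$ before this operation is performed. Thus

$$\frac{t^j}{(1 + a_1t + \ldots + a_jt^j)(t^j + a_1t^{j-1} + \ldots + a_j)} = \frac{A_0 + A_1t + \ldots + A_{j-1}t^{j-1}}{1 + a_1t + \ldots + a_jt^j} + \frac{B_0 + B_1t + \ldots + B_{j-1}t^{j-1}}{t^j + a_1t^{j-1} + \ldots + a_j}, \qquad (5)$$

and using $\rho_i = \rho_{-i}$, we can see that

$$A_0 = -B_0/a_j, \qquad A_i = B_{j-i} - a_iB_0/a_j \quad (i = 1, \ldots, j-1).$$

Thus the autocorrelations will be generated by

$$\sum_{i=-\infty}^{\infty}\rho_it^i = 1 + \frac{t}{A_0}\,\frac{B_{j-1} + B_{j-2}t + \ldots + B_0t^{j-1}}{1 + a_1t + \ldots + a_jt^j} + \frac{1}{A_0t}\,\frac{B_{j-1} + B_{j-2}t^{-1} + \ldots + B_0t^{-(j-1)}}{1 + a_1t^{-1} + \ldots + a_jt^{-j}}, \qquad (6)$$

where the first term is expanded in powers of $t$ and the second term is expanded in powers of $t^{-1}$.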


4. The expression (6) can now be squared to give a generating function for $\sum_{i=-\infty}^{\infty}\rho_i\rho_{i+t}$. It will be
necessary to split

$$\frac{t(B_{j-1} + B_{j-2}t + \ldots + B_0t^{j-1})(B_{j-1}t^{j-1} + B_{j-2}t^{j-2} + \ldots + B_0)}{(1 + a_1t + \ldots + a_jt^j)(t^j + a_1t^{j-1} + \ldots + a_j)}$$

into partial fractions, but the labour will be reduced since the matrix of the coefficients of the equations
in $B_i$ will be unaltered.


5. To illustrate the method, we can consider Kendall’s series 1, which was used by Bartlett in his 
example. 


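The autoregressive scheme for this series is

$$u_{n+2} - 1.1u_{n+1} + 0.5u_n = \epsilon_{n+2},$$

so that

$$\sigma^2\sum_{i=-\infty}^{\infty}\rho_it^i = \frac{t^2}{(1 - 1.1t + 0.5t^2)(t^2 - 1.1t + 0.5)} = \frac{-2B_0 + (B_1 + 2.2B_0)t}{1 - 1.1t + 0.5t^2} + \frac{B_0 + B_1t}{t^2 - 1.1t + 0.5},$$

where

$$\begin{bmatrix} B_1 \\ B_0 \end{bmatrix} = \begin{bmatrix} 3.7692 & 2.1154 \\ -2.1154 & -1.4423 \end{bmatrix}\begin{bmatrix} 0 \\ 1 \end{bmatrix} = \begin{bmatrix} 2.1154 \\ -1.4423 \end{bmatrix}.$$

Thus $\sigma^2 = 2.8846$,

$$\sum_{i=-\infty}^{\infty}\rho_it^i = 1 + \frac{t(0.7333 - 0.5t)}{1 - 1.1t + 0.5t^2} + \frac{t^{-1}(0.7333 - 0.5t^{-1})}{1 - 1.1t^{-1} + 0.5t^{-2}}, \qquad (7)$$

so that

$$\rho_i - 1.1\rho_{i-1} + 0.5\rho_{i-2} = 0 \quad (i > 0). \qquad (8)$$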
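The figures of this example can be reproduced with a few lines of arithmetic. The sketch below is an added illustration under stated assumptions: the two linear equations are those obtained by equating the coefficients of $t$ and $t^2$ in the partial-fraction identity for a second-order scheme with $A_0 = -B_0/a_2$ and $A_1 = B_1 - a_1B_0/a_2$, and the unknowns are ordered $(B_1, B_0)$. The helper `solve_2x2` is hypothetical. As a cross-check, $\sigma^2$ is also computed as the sum of squares of the coefficients $b_l$.

```python
def solve_2x2(matrix, rhs):
    """Solve a 2x2 linear system by Cramer's rule."""
    (p, q), (r, s) = matrix
    det = p * s - q * r
    return ((s * rhs[0] - q * rhs[1]) / det,
            (p * rhs[1] - r * rhs[0]) / det)


# Scheme u_{n+2} + a1 u_{n+1} + a2 u_n = eps_{n+2} with a1 = -1.1, a2 = 0.5.
a1, a2 = -1.1, 0.5

# Coefficients of t and t^2 in the identity, unknowns in the order (B1, B0).
matrix = ((1.0 + a2, -a1 / a2),
          (2.0 * a1, a2 - (1.0 + a1 * a1) / a2))
B1, B0 = solve_2x2(matrix, (0.0, 1.0))

sigma2 = -B0 / a2      # A_0, the variance of u in units of the variance of eps
rho1 = B1 / sigma2     # first autocorrelation, as in the numerator of (7)

# Independent check: sigma^2 equals the sum of squares of the b_l of (2).
b = [1.0, -a1]
for _ in range(400):
    b.append(-a1 * b[-1] - a2 * b[-2])
sigma2_direct = sum(x * x for x in b)

print(round(B1, 4), round(B0, 4), round(sigma2, 4), round(rho1, 4))
```

Only the right-hand side of the linear system changes when the squared expression is treated, which is the saving of labour referred to in section 4.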


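If we now consider the square of the expression (7) we have a product term

$$\frac{2t(0.7333 - 0.5t)(0.7333t - 0.5)}{(1 - 1.1t + 0.5t^2)(t^2 - 1.1t + 0.5)} = \frac{-2B_0 + (B_1 + 2.2B_0)t}{1 - 1.1t + 0.5t^2} + \frac{B_0 + B_1t}{t^2 - 1.1t + 0.5},$$

where

$$\begin{bmatrix} B_1 \\ B_0 \end{bmatrix} = \begin{bmatrix} 3.7692 & 2.1154 \\ -2.1154 & -1.4423 \end{bmatrix}\begin{bmatrix} -0.7333 \\ 1.5754 \end{bmatrix} = \begin{bmatrix} 0.5686 \\ -0.7210 \end{bmatrix},$$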









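and, if we write

$$P_t = \sum_{i=-\infty}^{\infty}\rho_i\rho_{i+t}\Big/\sum_{i=-\infty}^{\infty}\rho_i^2,$$

$$\sum_{i=-\infty}^{\infty}\rho_i^2\,\sum_{i=-\infty}^{\infty}P_it^i = 1 + \frac{2t(0.7333 - 0.5t)}{1 - 1.1t + 0.5t^2} + \left\{\frac{t(0.7333 - 0.5t)}{1 - 1.1t + 0.5t^2}\right\}^2 + 1.4420 + \frac{t(0.5686 - 0.7210t)}{1 - 1.1t + 0.5t^2} + \text{terms in } t^{-1}$$

$$= 2.4420 + \frac{t(2.0352 - 1.7210t)}{1 - 1.1t + 0.5t^2} + \frac{t^2(0.5377 - 0.7333t + 0.25t^2)}{(1 - 1.1t + 0.5t^2)^2} + \text{terms in } t^{-1}. \qquad (9)$$

From this we have $\sum_{i=-\infty}^{\infty}\rho_i^2 = 2.4420$ and the ‘correlations’ $P_t$ of the correlations are 0.8334, 0.4321,
0.0006, .... Successive terms may be calculated using the relation

$$P_i - 2.2P_{i-1} + 2.21P_{i-2} - 1.1P_{i-3} + 0.25P_{i-4} = 0 \quad (i > 0). \qquad (10)$$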
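The values quoted from (9) and the relation (10) can be confirmed by summing the products of autocorrelations directly. The sketch below is an added check, not the generating-function method of the note: it builds $\rho_i$ from relation (8), truncates the infinite sums at 400 terms (an arbitrary cut-off, ample because $\rho_i$ decays geometrically), and evaluates $P_t$. The coefficients in `recurrence_residual` are those of $(1 - 1.1t + 0.5t^2)^2$.

```python
def autocorrelations(a1, a2, count):
    """rho_0, rho_1, ... for u_{n+2} + a1 u_{n+1} + a2 u_n = eps_{n+2}."""
    rho = [1.0, -a1 / (1.0 + a2)]
    while len(rho) < count:
        rho.append(-a1 * rho[-1] - a2 * rho[-2])
    return rho


def lagged_sum(rho, t):
    """Sum over i of rho_i * rho_{i+t}, using rho_{-i} = rho_i, truncated to len(rho)."""
    n = len(rho)
    return sum(rho[abs(i)] * rho[abs(i + t)] for i in range(-(n - 1), n - t))


def recurrence_residual(P, i):
    """Left-hand side of relation (10), with P_{-k} taken equal to P_k."""
    coeffs = (1.0, -2.2, 2.21, -1.1, 0.25)
    return sum(c * P[abs(i - k)] for k, c in enumerate(coeffs))


rho = autocorrelations(-1.1, 0.5, 400)
total = lagged_sum(rho, 0)                       # sum of rho_i^2
P = [lagged_sum(rho, t) / total for t in range(9)]

print(round(total, 4), [round(p, 4) for p in P[1:4]])
print([abs(recurrence_residual(P, i)) < 1e-9 for i in range(1, 5)])
```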


The calculation of $\sum P_i^2$, suggested by Bartlett, can also be made by this method, but it is more
arduous, and the first few terms will give a good approximation.


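7. The same method can be used to calculate the appropriate number of degrees of freedom for
testing the correlation between two linear autoregressive schemes.

In general, if $E(u_iu_j) = \rho_{ij}\sigma_u^2$, $E(v_iv_j) = \rho'_{ij}\sigma_v^2$ and

$$r = \frac{\sum_{i=1}^{n}u_iv_i}{\sqrt{\left(\sum_{i=1}^{n}u_i^2\,\sum_{i=1}^{n}v_i^2\right)}},$$

then

$$\operatorname{var} r \sim \frac{\sum_{i=1}^{n}\sum_{j=1}^{n}\rho_{ij}\rho'_{ij}}{\sum_{i=1}^{n}\rho_{ii}\,\sum_{i=1}^{n}\rho'_{ii}}.$$

For linear autoregressive schemes, $\rho_{ij} = \rho_{i-j}$, $\rho'_{ij} = \rho'_{i-j}$, and thus

$$\operatorname{var} r \sim \frac{n + 2(n-1)\rho_1\rho'_1 + \ldots + 2\rho_{n-1}\rho'_{n-1}}{n^2} \sim \frac{1}{n}\sum_{i=-\infty}^{\infty}\rho_i\rho'_i.$$

Thus, provided n is large, r can be tested with $n\big/\sum_{i=-\infty}^{\infty}\rho_i\rho'_i$ degrees of freedom, and the calculation
of $\sum_{i=-\infty}^{\infty}\rho_i\rho'_i$ can be made by the above method.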
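As a small illustration of the degrees-of-freedom rule, consider two first-order schemes with $\rho_i = a^{|i|}$ and $\rho'_i = b^{|i|}$; this case is not treated in the note and is chosen here only because the sum has the simple closed form $(1 + ab)/(1 - ab)$. The sketch below, with hypothetical function names and arbitrary values of a, b and n, compares a truncated numerical sum with that closed form.

```python
def cross_sum(rho_u, rho_v):
    """Sum over all integers i of rho_i * rho'_i, given one-sided lists rho_0, rho_1, ..."""
    tail = sum(x * y for x, y in zip(rho_u[1:], rho_v[1:]))
    return rho_u[0] * rho_v[0] + 2.0 * tail


def effective_degrees_of_freedom(n, rho_u, rho_v):
    """n divided by the sum of products of the two sets of autocorrelations."""
    return n / cross_sum(rho_u, rho_v)


# Assumed example: first-order schemes with parameters a and b, series length n.
a, b, n = 0.6, 0.5, 100
rho_u = [a ** k for k in range(200)]
rho_v = [b ** k for k in range(200)]

edf = effective_degrees_of_freedom(n, rho_u, rho_v)
closed_form = n * (1.0 - a * b) / (1.0 + a * b)
print(round(edf, 3), round(closed_form, 3))
```

When either series is uncorrelated the sum reduces to 1 and the full n degrees of freedom are recovered.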


8. Finally, it is worth noting that, for autoregressive schemes involving m observables, it is possible 
to extend this method by the use of m parameters to calculate the correlations within and between the 
observables, provided that adequate estimates of the coefficients of the equations are available. In 
practice, however, the procedure will often be reversed, and estimates of the coefficients of the auto- 
regressive schemes will be obtained by equating the theoretical and observed correlations. 


REFERENCES 


BARTLETT, M. S. (1946). J.R. statist. Soc. Suppl. 8, 27.
WOLD, H. (1938). A Study in the Analysis of Stationary Time-series. Uppsala.







Approximate formulae for the percentage points of the incomplete beta function
and of the χ² distribution


By D. HALTON THOMSON 


Valuable ‘Tables of percentage points of the Incomplete Beta Function’ have been published in Biometrika
(Thompson, 1941a) giving numerical values of percentage points at various probability levels
between P = 0.995 and P = 0.005 for degrees of freedom $\nu_1 = 2q$ and $\nu_2 = 2p$ ranging up to 120, and with
an accuracy of five significant figures. In the same volume, a ‘Table of the percentage points of the χ²
distribution’ was also published (Thompson, 1941b) for values at the same probability levels and degrees
of freedom ranging up to ν = 100, and with an accuracy of six significant figures, thus supplementing the
table of that function originally due to R. A. Fisher (Fisher & Yates, 1938).

Cases arise in practice where the tails of the frequency distribution of a large population are of special
interest, thus involving (in the case of the beta-function) values of 2p larger than 120, with a small 2q,
or vice versa. Harmonic interpolation between 120 and infinity, however, leads to substantial errors,
as is found when the values of the percentage points x are expressed in terms of their tail values (x or
1 − x < 0.5). This Note shows that close approximations to such extreme values may be determined by
using the χ² table as an auxiliary table to extend the Beta Function Tables in conjunction with certain
simple alternative formulae. Comparisons within the range of the published Beta Function Tables are
made indicating the degree of accuracy within that range. The accuracy of these formulae beyond that
range increases rapidly with increasing 2p and decreasing 2q (and vice versa), so that they can be applied
with confidence under such conditions.

The ‘normalized’ form of the Incomplete Beta Function, in the usual notation, is

$$P = I_x(p, q) = \frac{\Gamma(p+q)}{\Gamma(p)\,\Gamma(q)}\int_0^x x^{p-1}(1-x)^{q-1}\,dx, \qquad (1)$$

in which, for a given P, 1 − x(q, p) in the tables denotes the upper percentage point and x(p, q) the lower
percentage point.

It is known that, when p is large and q is small compared with p, this form tends towards the Incomplete
Gamma Function in a variable t, where $x(p, q) = e^{-t}$. This in turn may be transformed to the χ² distribution
by putting $pt = [\chi^2_{2q}(P)]/2$. For a given large p and small q, therefore, the percentage point in terms of χ²
is given approximately by

$$x(2p, 2q) \simeq \exp\left[-\frac{\chi^2_{2q}(P)}{2p}\right], \qquad (2)$$

where 2q = ν in the χ² table. This expression gives the exact value of x when 2q = 2, but for larger 2q
the error, which is consistently negative, increases rapidly with increasing 2q unless 2p is very large,
much larger than 120. It is, therefore, of limited practical use. The following modifications were in
consequence evolved.
APPROXIMATION A 
Consider the constant of integration in the original form (1) which, when expanded, is

$$\frac{(p+q-1)(p+q-2)\ldots(p+1)p}{\Gamma(q)}.$$

Let the terms q − 1, q − 2, ..., 1, 0 be averaged; the constant as a first approximation then becomes

$$\frac{\{p + \tfrac12(q-1)\}^q}{\Gamma(q)}.$$

The numerator suggests that a more accurate approximation for x would be obtained by substituting
$p + \tfrac12(q-1)$ in place of p in (2), thus leading to

$$x(2p, 2q) \simeq \exp\left[-\frac{\chi^2_{2q}(P)}{2p+q-1}\right]. \qquad (A)$$













A comparison of the approximate values of x obtained from (A) with the exact values in the Beta
Function Tables, for all probability levels between P = 0.995 and P = 0.005, shows that:

(a) The error is consistently positive, but much smaller than the negative error in (2); in other words
the latter is slightly over-corrected.

(b) For a given p/q and varying P, the error is nearly constant; it is smallest at P = 0.995 and increases
gradually in the direction of P = 0.005.

(c) For a given P, the error decreases rapidly with increasing 2p and/or decreasing 2q.

(d) Provided that p/q is larger than 4, the value of x is within 0.5 % of the exact tail value; if p/q is
larger than 10, the error is within 0.1 % of that value.
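Formula (2) and Approximation A are simple enough to evaluate directly once the χ² percentage point is known. The sketch below is an added illustration: the function names are hypothetical, the χ² value for 10 degrees of freedom at P = 0.995 is the usual tabulated figure 2.15586 supplied by hand, and for 2 degrees of freedom the percentage point is taken as $-2\log P$. The results may be compared with the entries of Table 1 for 2p = 120.

```python
import math


def x_formula_2(two_p, chi2):
    """Formula (2): x = exp(-chi2 / 2p), chi2 being the point for 2q degrees of freedom."""
    return math.exp(-chi2 / two_p)


def x_approx_a(two_p, two_q, chi2):
    """Approximation A: x = exp(-chi2 / (2p + q - 1))."""
    return math.exp(-chi2 / (two_p + two_q / 2.0 - 1.0))


# 2q = 2: the chi-square point with upper-tail probability P is -2 log P,
# and both formulae then give P ** (1 / p).
P = 0.995
chi2_two_df = -2.0 * math.log(P)
print(x_formula_2(120, chi2_two_df), x_approx_a(120, 2, chi2_two_df))

# 2q = 10: tabulated chi-square point at P = 0.995 (assumed input).
chi2_ten_df = 2.15586
print(x_formula_2(120, chi2_ten_df), x_approx_a(120, 10, chi2_ten_df))
```

For 2q = 10 formula (2) falls below the tabulated exact value while (A) lies slightly above it, in line with remark (a).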


APPROXIMATION B 
The exponent in (A) may be written

$$-\frac{\chi^2_{2q}(P)}{2p+q-1} = \frac{\chi^2_{2q}(P)}{2q}\left[-\frac{2q}{2p+q-1}\right].$$

The factor in square brackets is equivalent to the first term in the known expansion of the form

$$\log\left(\frac{n-1}{n}\right) = -2\left\{\frac{1}{2n-1} + \frac{1}{3(2n-1)^3} + \ldots\right\},$$

where n = (2p + 2q − 1)/2q, which converges rapidly when n is large; i.e. when 2p is large compared
with 2q. The above exponent may therefore be written

$$\frac{\chi^2_{2q}(P)}{2q}\log\left(\frac{2p-1}{2p+2q-1}\right),$$

which, when inserted in (A), leads to

$$x(2p, 2q) \simeq \left(\frac{2p-1}{2p+2q-1}\right)^k, \qquad (B)$$

where $k = [\chi^2_{2q}(P)]/(2q)$.

A similar comparison with the Beta Function Tables, for the same range of probability levels, shows
that:

(a) Approximation (B) gives generally more accurate values than (A), except when 2q is very small,
in which case they are nearly identical.

(b) For a given p/q and varying P, the error is negligible in the vicinity of P = 0.25; it increases
negatively in the direction of P = 0.995, and positively in the direction of P = 0.005, the largest errors
occurring at this level.

(c) For a given P, the error decreases rapidly with increasing 2p and/or decreasing 2q.

(d) Provided that $(2p)^3/(2q)^2$ is larger than about 150, the values of x are within 0.5 % of the exact
tail value; this implies that if 2p is larger than about 150, this degree of accuracy is attained even when
p/q is as low as unity. If $(2p)^3/(2q)^2$ is larger than about 2000, the error is within 0.1 % of the exact value,
which implies that if 2p is larger than about 120, this degree of accuracy is attained when p/q is as low as 4.

It will be observed that, when 2q = 2, the formula does not revert exactly to (2), as is required by theory;
but, unless 2p is also quite small, the error in the computed value of x is negligible.
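Approximation B can be evaluated in the same manner. The sketch below is an added illustration with hypothetical function names; the χ² point 2.15586 for 10 degrees of freedom at P = 0.995 is again a tabulated value supplied by hand. It also shows how closely the first term of the logarithmic expansion, which corresponds to (A), agrees with the full logarithm used in (B).

```python
import math


def x_approx_b(two_p, two_q, chi2):
    """Approximation B: x = ((2p - 1) / (2p + 2q - 1)) ** k, with k = chi2 / 2q."""
    k = chi2 / two_q
    return ((two_p - 1.0) / (two_p + two_q - 1.0)) ** k


def log_ratio_series(n, terms):
    """Partial sum of log((n-1)/n) = -2 {1/(2n-1) + 1/(3 (2n-1)^3) + ...}."""
    w = 1.0 / (2.0 * n - 1.0)
    return -2.0 * sum(w ** (2 * j + 1) / (2 * j + 1) for j in range(terms))


two_p, two_q = 120, 10
n = (two_p + two_q - 1.0) / two_q

print(x_approx_b(two_p, two_q, 2.15586))
print(math.log((n - 1.0) / n), log_ratio_series(n, 1), log_ratio_series(n, 3))

# 2q = 2: the formula does not revert exactly to (2), but the discrepancy is minute.
chi2_two_df = -2.0 * math.log(0.995)
print(x_approx_b(120, 2, chi2_two_df), math.exp(-chi2_two_df / 120.0))
```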


The expansion of (B) leads to x(2p, 2g) X e- —9(1 + $e) v +20, 
— —XetP) mats “Ei 
where °"* Sa} and *= 54 2q-1" 


thus demonstrating its analogies with Campbell’s formula (C) below. 


ADAPTATION OF CAMPBELL’S FORMULA 


In a book concerned primarily with quality control, Simon (1941) quotes (without the proof) a formula,
due to Campbell (1923), designed to determine the average number of defectives in a sample of n, starting
from the known average number in an infinite sample. It is a particular application of the general
problem now under consideration, namely, the approximate determination of the percentage points of
the Beta Function, starting from the corresponding known values for the χ² form of the Poisson exponential
binomial summation. It is given in the following form:


P)— 
a * 2 ee *) = An-4 py[l44*+ (80+2) A +a]n-*+..., (3) 





where a(c, n, P) = average number of defectives in which P is the probability of at least c defectives in
a sample of n, a(c, ∞, P) = average number of defectives in an infinite sample, A = ½(c − a − 1), in which
a = a(c, ∞, P). (Simon quotes a = (a, ∞, P), which is an evident misprint.)

If G denotes the value given by the formula, then

a(c, n, P) = a(c, ∞, P)(1 + G),

so that 1 + G is the factor by which the average number of defectives in an infinite sample must be
multiplied to give that in a sample of n.

The change from Campbell's notation to the more familiar general notation is given by

$$a(c, n, P) = \{1 - x(2p, 2q)\}n, \qquad a = a(c, \infty, P) = [\chi^2_{2q}(P)]/2,$$

where n = p + q − 1, and c = q.
Let u = a/n and r = (c − 1)/(2n),
then A = n(r − u/2).


By inserting this notation in (3) and rearranging the terms, the formula leads to 


x(2p, 2q) x 1—{14r(1 +ir¢ s)}u+a +4n) dur... 


which expression includes the first four terms in the expansion of $e^{-u}$.


Hence, for the determination of the percentage points x(2p, 2q), Campbell’s formula may, in effect,
be re-written as





r I 
x(2p, 24) ze+—r (1 “r+ g) ut tne, » (C) 
where $u = \dfrac{\chi^2_{2q}(P)}{2(p+q-1)}$ and $r = \dfrac{q-1}{2(p+q-1)}$.
For large 2p and small 2q, the last two terms become negligible, in which case it reduces to 
a(2p, 2g) S e-* —r(1 + Fr) u. (C’) 


CocHRAN’S APPROXIMATION 


Cochran (1940), extending a method of Fisher’s (1925), has introduced a useful approximation for
the percentage points of the Incomplete Beta Function, when both p and q are large, his method being
to determine a sufficiently accurate value of z, as used in Fisher’s z-transformation.

If y is the normal deviate at probability level P, then for a given pair of arguments 2p, 2q, the following
are first calculated, using Hartley’s (1941) notation:

$$\lambda = \tfrac16(y^2 + 3), \qquad h = \frac{8pq}{2p + 2q},$$

$$z = \frac{y}{\sqrt{(h - \lambda)}} + \frac{(\lambda - \tfrac16)(2q - 2p)}{4pq}.$$

Hence, by Fisher’s transformation,

$$x(2p, 2q) = \frac{2p}{2p + 2qe^{2z}}. \qquad (D)$$
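The steps of Cochran's approximation can be set out as a short routine. This is a hedged sketch rather than a transcription of the printed formula: the expression used for z, namely $y/\sqrt{(h-\lambda)} - (\lambda - \tfrac16)\{1/(2q) - 1/(2p)\}$, is an assumption reconstructed so as to reproduce the D column of Table 1, and y is taken as the normal deviate whose upper-tail probability is P (negative when P = 0.995). The function name is hypothetical.

```python
import math
from statistics import NormalDist


def x_cochran(two_p, two_q, upper_tail_p):
    """Approximation D: percentage point x(2p, 2q) through Fisher's z."""
    # Normal deviate exceeded with probability P.
    y = NormalDist().inv_cdf(1.0 - upper_tail_p)
    lam = (y * y + 3.0) / 6.0
    # h = 8pq / (2p + 2q), written in terms of the arguments 2p and 2q.
    h = 2.0 * two_p * two_q / (two_p + two_q)
    # Assumed form of z (see the lead-in); it vanishes in its second term when p = q.
    z = y / math.sqrt(h - lam) - (lam - 1.0 / 6.0) * (1.0 / two_q - 1.0 / two_p)
    return two_p / (two_p + two_q * math.exp(2.0 * z))


for two_q in (10, 20, 120):
    print(two_q, round(x_cochran(120, two_q, 0.995), 6))
```

Interchanging 2p with 2q and P with 1 − P changes the sign of z, so the routine returns the complementary value 1 − x, as the symmetry of the Beta Function requires.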








CoMPARISON OF FORMULAE 


Table 1 compares the various formulae for upper percentage points at an extreme probability level
(P = 0.995). Table 2 indicates their relative accuracy on a common basis, namely, as a percentage of the
exact value of x or 1 − x, whichever is the smaller, so that the deviations from the exact values, when
x or 1 − x approach zero, are duly emphasized. For intermediate probability levels, the percentages lie
between the tabulated extremes. It will be noted that in the case of approximation B the errors pass
through zero near the mid-range of P; in the cases of A and C the errors are positive for all values of P.























The general conclusions from these tables and other comparisons are that, for a given probability
level P:

(a) When p/q > about 6, approximations A, B and C have about the same degree of accuracy, so that
the simpler, A or B, have the advantage.

(b) In the range 6 > p/q > 4, there is little to choose between B and C; but B is the simpler.

(c) When p/q < 4 and the distribution approaches symmetry, D gives the best results, provided that
2p and 2q are moderately large, say > 50. It may be, however, that B in this range will be sufficiently
accurate for many purposes; if p/q > 2, the maximum error of x is about 2 units in the third decimal place.


Table 1. Comparison of approximate formulae at a given probability level

P = 0.995; entries are x(2p, 2q)

 2p    2q         A              B              C              D           Exact
                                            (Campbell)     (Cochran)
120     2    0.999916461    0.999916459    0.999916461    0.999862     0.999916461
                 Nil       −0.000000002        Nil       −0.000054
        4    0.9982908      0.9982906      0.9982907      0.997926     0.9982907
            +0.0000001     −0.0000001          Nil       −0.000365
       10    0.982764       0.982755       0.982760       0.982076     0.982759
            +0.000005      −0.000004      +0.000001      −0.000683
       20    0.944002       0.943893       0.943941       0.943366     0.943930
            +0.000072      −0.000037      +0.000011      −0.000564
       30    0.902230       0.901839       0.902000       0.901551     0.901950
            +0.000280      −0.000111      +0.000050      −0.000399
       40    0.86160        0.86070        0.86106        0.86066      0.86093
            +0.00067       −0.00023       +0.00013       −0.00027
       60    0.78782        0.78522        0.78621        0.78568      0.78579
            +0.00203       −0.00057       +0.00042       −0.00011
      120    0.62598        0.61430        0.61855        0.61620      0.61620
            +0.00978       −0.00190       +0.00235           Nil

N.B. The second line of each entry gives the difference between the approximate and exact values.


Table 2. Relative accuracy of approximate formulae
Error of x(2p, 2q) expressed as a percentage of the smaller exact tail value (x or 1 − x < 0.5)

                          A                     B                C (Campbell)          D (Cochran)
 2p   2q  p/q   P = 0.995 0.500 0.005    0.995 0.500 0.005    0.995 0.500 0.005    0.995 0.500 0.005
                      %     %     %        %     %     %        %     %     %        %     %     %
120   12   10         *     *     *        *     *     *        *     *     *      −2.8  −0.1  +0.3
      20    6       +0.1  +0.2  +0.2     −0.1    *   +0.1       *     *   +0.1     −1.0    ?   +0.1
      40    3       +0.5  +0.6  +0.7     −0.2    *   +0.2     +0.1  +0.1  +0.3     −0.2    *     *
      60    2       +0.9  +1.1  +1.2     −0.3    *   +0.2     +0.2  +0.1  +0.5     −0.1    *     *
     120    1       +2.5  +2.7  +4.4     −0.5    *   +0.7     +0.6  +0.5  +1.7       *     *     *
 30    3   10         *     *   +0.1       *     *     *        *     *   +0.1    −30.0  −1.4  −7.1
       5    6       +0.1  +0.1  +0.3     −0.1  −0.1  +0.1       *     *   +0.4    −11.1  −0.5  −1.8
      10    3       +0.4  +0.5  +0.8     −0.3  −0.1  +0.3       ?   +0.1  +1.3     −2.2  −0.1  −0.5
      15    2       +0.8  +1.0  +1.9     −0.5  −0.1  +0.6       ?   +0.1  +2.8     −0.7    *   −0.3
      30    1       +2.4  +2.7  +7.1     −1.0  −0.1  +1.8     +1.0  +0.5  +8.2       *     *     *

* Error smaller than ± 0.05 %.













WILSON-HILFERTY APPROXIMATION FOR χ²-ADJUSTMENT
This formula (Wilson & Hilferty, 1931) for the percentage points of the χ² distribution is

$$\chi^2_\nu(P) = \nu\left\{1 - \frac{2}{9\nu} + y_P\sqrt{\frac{2}{9\nu}}\right\}^3,$$

where ν represents the degrees of freedom, and $y_P$ the standardized normal deviate corresponding to
probability level P. A table has been published in Biometrika (Merrington, 1941), comparing the
approximations derived from this formula with the exact values, at various probability levels between
P = 0.995 and P = 0.005. It shows the remarkable accuracy of the formula, the maximum errors
varying from about ± 0.04, when ν = 30, to about ± 0.024, when ν = 100.

When these errors were plotted against the exact values on logarithmic paper, it was observed that
for a given probability level, they varied inversely with √ν very closely. It follows that this square root
relation may be used to adjust the Wilson-Hilferty formula, bringing the values computed therefrom
still nearer to the exact values.

If the difference (at ν = 30) between the Wilson-Hilferty value and the exact value, when multiplied
by √30, is treated as a coefficient C (which may be positive or negative), the required adjustment
for any value of ν is given by
Adjustment = C/√ν.


For various probability levels P, the values of C are given in the following table:

    P         C          P         C
  0.995    +0.233      0.250    +0.039
  0.990    +0.157      0.100    +0.056
  0.975    +0.067      0.050    +0.035
  0.950    +0.011      0.025    −0.015
  0.900    −0.029      0.010    −0.120
  0.750    −0.046      0.005    −0.227
  0.500    −0.013




















A test against the Merrington Table shows that this adjustment leads to values of χ², between ν = 30
and ν = 100 at all probability levels with an accuracy of ± 0.001, i.e. to four or five significant figures.
Since the Wilson-Hilferty approximation assumes a normal distribution about 1 − 2/(9ν), which tends
to unity as ν increases to infinity, and since the adjustment tends to zero under those conditions, it
follows that the latter may also be safely applied for an indefinitely large ν.
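The adjusted formula is easily programmed. The sketch below is an added illustration under two assumptions that should be checked against the Merrington table: P is read as the upper-tail probability, so that $y_P$ is the normal deviate exceeded with probability P, and the adjustment C/√ν is added to the Wilson-Hilferty value. Both readings agree with the standard tabulated points 43.773 (ν = 30, P = 0.05) and 13.787 (ν = 30, P = 0.995), which are supplied here by hand. The names `C_TABLE`, `wilson_hilferty` and `adjusted_chi2` are hypothetical.

```python
import math
from statistics import NormalDist

# Coefficients C from the table above, keyed by probability level P.
C_TABLE = {
    0.995: 0.233, 0.990: 0.157, 0.975: 0.067, 0.950: 0.011,
    0.900: -0.029, 0.750: -0.046, 0.500: -0.013, 0.250: 0.039,
    0.100: 0.056, 0.050: 0.035, 0.025: -0.015, 0.010: -0.120,
    0.005: -0.227,
}


def wilson_hilferty(nu, upper_tail_p):
    """Wilson-Hilferty value of the chi-square point exceeded with probability P."""
    y = NormalDist().inv_cdf(1.0 - upper_tail_p)
    return nu * (1.0 - 2.0 / (9.0 * nu) + y * math.sqrt(2.0 / (9.0 * nu))) ** 3


def adjusted_chi2(nu, upper_tail_p):
    """Wilson-Hilferty value plus the adjustment C / sqrt(nu) (assumed sign)."""
    return wilson_hilferty(nu, upper_tail_p) + C_TABLE[upper_tail_p] / math.sqrt(nu)


for nu, p, tabulated in ((30, 0.05, 43.773), (30, 0.995, 13.787)):
    print(nu, p, round(wilson_hilferty(nu, p), 4), round(adjusted_chi2(nu, p), 4), tabulated)
```

Only the tabulated levels of P can be used with the adjustment as written; intermediate levels would need interpolation in C, which the note does not discuss.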


It should be added that an adjustment on similar principles is not applicable to the Fisher approximation
for χ².


REFERENCES 
CAMPBELL, G. A. (1923). Bell Syst. Tech. J. January.

COCHRAN, W. G. (1940). Note on an approximate formula for significance levels of z. Ann. Math.
Statist. 11, 93.

FISHER, R. A. (1925). Statistical Methods for Research Workers, 1st edition. Edinburgh: Oliver and Boyd.

FISHER, R. A. & YATES, F. (1938). Statistical Tables for Biological, Agricultural and Medical Research.
Edinburgh: Oliver and Boyd.

HARTLEY, H. O. (1941). Tables of percentage points of the Incomplete Beta Function. Methods of
interpolation. Biometrika, 32, 166.

MERRINGTON, M. (1941). Numerical approximations to the percentage points of the χ² distribution.
Biometrika, 32, 200.

SIMON, L. E. (1941). An Engineer’s Manual of Statistical Methods, p. 185. New York: John Wiley
and Sons, Inc.

THOMPSON, C. M. (1941a). Tables of percentage points of the Incomplete Beta Function. Biometrika,
32, 151.

THOMPSON, C. M. (1941b). Table of percentage points of the χ² distribution. Biometrika, 32, 187.

WILSON, E. B. & HILFERTY, M. M. (1931). The distribution of chi-square. Proc. Nat. Acad. Sci.
Wash. 17, 684.













REVIEWS 


A First Course in Mathematical Statistics. By C. E. WEATHERBURN. Cambridge
University Press. Price 15s.


An outstanding feature of the present statistical time is the number of text-books which are being 
written, and each one from a slightly different point of view. It is this which makes statistical theory 
interesting to study, for there can be no rigid approach to a subject which is used and expounded by so 
many and diverse persons. Professor Weatherburn has taken a rather formal mathematical exposition 
of the subject, and mathematical students will find his book both interesting and profitable to read. 
Numerical examples are given for the reader to apply the appropriate mathematical technique. It is 
possible that these would have been of greater utility if they had contained the material in its crude state, 
and had not been streamlined so that the application of the technique is immediately obvious, but 
nevertheless many new examples are there. 

I am not sure whether this book will be entirely useful to students of other subjects than mathematics. 
While the mathematical analysis is undoubtedly clear it is possible that many will not be able to follow 
it in detail, and the conclusions of the analysis are not emphasized strongly. We may contrast with this 
Fisher’s Statistical Methods for Research Workers, where no analysis is given, but where the relevant 
formulae and their interpretation are stated unmistakeably and their applications to material in its 
crude state set out so that the student may calculate for himself. 

Probability theory is the foundation stone on which the whole of statistical theory is built. It is
disappointing therefore to find that it is given somewhat perfunctory treatment in one chapter and the part
it plays in (say) statistical tests of significance is not brought out and emphasized. There is a tendency
nowadays in applying statistical technique to regard the 5 % and 1 % levels of significance as sacrosanct
and those coming fresh to the subject should learn that custom is the only reason for their choice.

In spite of the criticisms which I make, however, I would recommend this book to students who have 
obtained some idea of the aims and objectives of statistical theory, and who are desirous of learning the 
development of the mathematical technique as well as its application. Professor Weatherburn’s mathe- 
matical analysis makes pleasant reading and may well throw new light on old methods for those who 
have learnt the rudiments of the theory. 


F. N. DAVID


Advances in Genetics, Volume 1. New York, N.Y.: Academic Press. 1947. 


This is the first number of a new periodical, probably an annual, summarizing recent work in various
fields. Of the nine articles, ranging from 12 to 96 pages, with mean 42.6, S.D. 7.89, and a positively skew
distribution, perhaps the most interesting to European geneticists will be that on the genetics of the
ciliate Protozoa, Paramecium and Euplotes. Here Sonneborn describes work almost entirely done in
America, with very surprising results. Thus Paramecium aurelia consists of at least seven endogamous
varieties, each with two exogamous mating types, which might be called sexes were it not that in
P. bursaria one of the varieties has no less than eight mating types.

Shrode and Lush’s article on the genetics of cattle gives a very condensed account of the large amount
of work which has been done on the inheritance of economically important characters such as milk yield 
and growth rate. For example cattle biometricians have used the important concept of ‘heritability’, 
meaning the fraction of the variance of a character due to additive genetic differences. Within a herd 
this rarely exceeds 30 %. More space is devoted to work on the genetics of colour and the like, which is 
of far less economic importance, and the review of progeny testing methods is disappointingly brief. 
However, the bibliographical references will be useful. Similarly, Atwood’s article on forage crops, though 
most valuable as a guide to the literature, does not give a detailed account of any of the biometric work 
which has been done on grasses and clovers. 

Only two of the papers give data which a biometrician could immediately utilize. These are Gordon’s
account of polymorphism in fish populations, and Spencer’s of mutations in wild Drosophila species,
which unfortunately does not include some valuable recent Italian and Russian work. Gordon’s results
call for the development of methods of estimating gene frequency similar to those used with human blood
groups. Spencer is mainly concerned with results, but these are often given in sufficient detail to interest
biometricians, though no attempt is made to summarize Wright’s fundamental statistical theory.







The other articles will be less attractive to biometricians, though it is of interest to see how statistical
methods are demanded by the mere fact that the genus Crepis, whose evolution is reviewed by Babcock,
includes 196 species, most of which have been examined cytologically, and between which 130 of the
38,220 possible crosses have been made.

The volume will be indispensable to geneticists. Biometricians certainly cannot neglect it. 


J.B.S. H. 


Mathematical Methods of Statistics. By H. Cramér, Princeton University Press. 1946. 
$6.00. 


This book was written by Prof. Cramér during the war and has been published first in Sweden and then
by an offset process by the Princeton University Press in the U.S.A. It is a definitive exposition of the
theory of mathematical statistics as it existed in 1940 (about) and it is worth while therefore to consider
its contents in some detail. Prof. Cramér has divided his exposition into three parts; the first part is
purely mathematical. The theory of sets and of such Lebesgue measure as is necessary for the understanding
of the second part is developed first of all. Such a development will be useful for the student
of mathematical statistics coming fresh to the theory of measure in that he receives guidance as to what
are the elements essential for him to understand. Chapters 11 and 12 on matrices, determinants and
quadratic forms and miscellaneous complements do not fit into this general scheme but have obviously
been included here as part of the mathematical equipment necessary for the student. Possibly Chapter 10
on Fourier Integrals would have fitted more naturally into Part II but this is a matter of taste.

Part II begins with a formal development of the theory of probability as given by the French and 
Russian schools of probability, and which Prof. Cramér has already given in his Cambridge tract ‘Random 
Variables and Probability Distributions’. The treatment here seems simpler, however, than in his earlier 
tract and there is a more practical flavour to his exposition. This part while still purely mathematical 
begins to introduce distributions and ideas which are familiar to the statistician. 

The title of the third part is ‘Statistical Inference’ and the main outline is that of small sample theory 
developed during the past twenty-five years. The illustrations are numerical as well as mathematical 
and an attempt is made to show the student the numerical applications of the processes through which 
his mathematical theory leads him. The treatment is not exhaustive but the student who has assimilated 
this part will have little difficulty in extending his knowledge by further reading. 

As a textbook of mathematical statistics this book will remain unrivalled for many years to come. The 
mathematical exposition is clear, the development of ideas logical throughout, and the theorems are 
presented in a very general way. Any student of mathematics who wishes to get a picture of what statis- 
tical theory is about will be led inevitably to a study of this book. To those who wish to become statisticians 
it will be necessary to supplement the reading by a practical course in which the mathematical tools 
are tried out on numerical examples. This aspect of statistical work the book does not cover, but it is
obvious that this would be the case from the title. It only remains to say to the student ‘This is a good
book, buy it’.

F. N. DAVID


CORRIGENDA 
(Biometrika, 34, 176-7) 


In J. Wishart’s paper on ‘The cumulants of the z and of the logarithmic χ² and t distributions’,
the following correction should be made: 


p. 176, 1st line of section 3: read ‘log | t |’ for ‘log t’, in two places.
p. 177, 1st line following equation (32): read ‘log | x |’ for ‘log x’. 

















(All Rights reserved) 


BIOMETRIKA. Vol. XXXIV, Parts III and IV 
CONTENTS 


                                                                                              PAGE

On the distribution of the rank correlation coefficient τ when the variates are not independent.
    By Wassily Höffding    183-196
The significance of rank correlations where parental correlation exists. By H. E. Daniels
    and M. G. Kendall    197-208
Testing for normality. By R. C. Geary    209-242
The stratified semi-stationary population. By S. Vajda    243-254
A simple approach to confounding and fractional replication in factorial experiments.
    By O. Kempthorne    255-272
A comparison of stratified with unrestricted random sampling from a finite population.
    By P. Armitage    273-280
Some theorems on time series. I. By P. A. P. Moran    281-291
Rank correlation between two variables, one of which is ranked, the other dichotomous.
    By J. W. Whitfield    292-296
The variance of τ when both rankings contain ties. By M. G. Kendall    297-298
A χ² ‘smooth’ test for goodness of fit. By F. N. David    299-310
An exact test for the equality of variances. By R. L. Plackett    311-319
The estimation from individual records of the relationship between dose and quantal response.
    By D. J. Finney    320-334
A power function for tests of randomness in a sequence of alternatives. By F. N. David    335-339
A numerical solution of the problem of moments. By H. O. Hartley and S. H. Khamis    340-351
Approximation to percentage points of the z-distribution. By A. H. Carter    352-358

MISCELLANEA:

Note on the cumulants of Fisher’s z-distribution. By Leo A. Aroian    359-360
A note on the mean deviation from the median. By K. R. Nair    360-362
On the method of paired comparisons. By P. A. P. Moran    363-365
Notes on the calculation of autocorrelations of linear autoregressive schemes.
    By M. H. Quenouille    365-367
Approximate formulae for the percentage points of the incomplete beta function and of
    the χ² distribution. By D. Halton Thomson    368-372

REVIEWS

A First Course in Mathematical Statistics    373
Advances in Genetics    373
Mathematical Methods of Statistics    374





A volume of Biometrika contains about 400 pages, with plates and tables, and it is hoped that in future this will be 
published annually in two half-yearly issues. 


Papers for publication should either be sent to
Professor E. S. Pearson, Department of Statistics, University College, London, W.C. 1,
or if more convenient may be submitted through a member of the Editorial Committee, viz.
Professor Harald Cramér, University of Stockholm, Sweden.
Dr R. C. Geary, Statistics Branch, Department of Industry and Commerce, Dublin.
Professor M. Greenwood, F.R.S., London School of Hygiene and Tropical Medicine, London, W.C. 1.
Professor J. B. S. Haldane, F.R.S., University College, London, W.C. 1.
Dr G. M. Morant, R.A.F. Institute of Aviation Medicine, R.A.F. Station, Farnborough, Hants.
Dr John Wishart, School of Agriculture, Cambridge.

It is a condition of publication in Biometrika that the paper shall not already have been issued elsewhere, and will not be
reprinted without leave of the Editors.

Contributors receive 25 copies of their papers free. Joint authors 15 copies each. 

The subscription price, payable in advance, is Inland 45s. net per volume and Abroad 54s. net (including packing and
postage). Owing to the scarcity of early volumes, the following rates must now be charged for complete sets. Vols. I–XXXIV,
including XX*: £126. 5s. in wrappers, not including postage. At present certain volumes are out of print, but steps are being 
taken to re-issue these as quickly as printing facilities permit. Recent volumes may still be obtained at the wrapper price; 
this is 64s. inland, including postage. Index to Vols. I to V, 2s. net. Index to Vols. I to XV, 5s. net. Cheques must be 
made payable to Biometrika, crossed “a/c Biometrika Trust” and sent to The Secretary, Biometrika Office, Department of
Statistics, University College, London, W.C. 1, to whom all orders for series, single copies and offprints should be addressed. 
All foreign cheques must be drawn in sterling and on a Bank having a London Agency. 


First printed in Great Britain at the University Press, Cambridge
Reprinted by offset-litho by Percy Lund Humphries & Co., Ltd.

























